Caching
Grimp uses a file-based cache to speed up subsequent builds of the graph:
>>> build_graph("somepackage", "anotherpackage") # Writes to a cache for the first time.
...
>>> build_graph("somepackage", "anotherpackage") # Second time it's run, it's much quicker.
What is cached?
Grimp caches the imports discovered through static analysis of the packages when it builds a graph.
It does not cache the results of any methods called on a graph, e.g. find_downstream_modules.
Separate caches of imports are created depending on the arguments passed to build_graph. For example,
the following invocations will each have a separate cache and will not be able to make use of each
other’s work:
build_graph("mypackage")
build_graph("mypackage", "anotherpackage")
build_graph("mypackage", "anotherpackage", include_external_packages=True)
build_graph("mypackage", "anotherpackage", exclude_type_checking_imports=True)
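One way to picture why these invocations cannot share a cache is to imagine a key derived from the arguments. The sketch below is purely illustrative: the cache_key function, the hashing scheme, and the decision to sort package names are assumptions made for this example and do not describe how Grimp actually names or organises its cache files.

```python
import hashlib
import json


def cache_key(*package_names, include_external_packages=False,
              exclude_type_checking_imports=False):
    """Hypothetical helper: derive a stable key from build_graph-style arguments."""
    payload = json.dumps(
        {
            # Sorting is a choice made by this sketch, not documented Grimp behaviour.
            "packages": sorted(package_names),
            "include_external_packages": include_external_packages,
            "exclude_type_checking_imports": exclude_type_checking_imports,
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# The four invocations listed above each produce a different key.
keys = {
    cache_key("mypackage"),
    cache_key("mypackage", "anotherpackage"),
    cache_key("mypackage", "anotherpackage", include_external_packages=True),
    cache_key("mypackage", "anotherpackage", exclude_type_checking_imports=True),
}
```

Because any change to the arguments changes the key, a cache written for one invocation would simply not be found by another.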
Grimp can make use of cached results even if some of the modules change. For example,
if mypackage.foo is changed, but all the other modules within mypackage are left
untouched, Grimp will only need to rescan mypackage.foo. This can significantly
speed up the analysis of large codebases in which only a small subset of files changes
from run to run.
Grimp determines whether or not it needs to rescan a file based on its last modified time. This makes it very effective for local development, but is less effective in environments that reinstall the package under analysis between each build of the graph (e.g. on a continuous integration server).
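The following is a minimal sketch of a modification-time check, to show why a reinstall defeats it. It is not Grimp's implementation: needs_rescan and the cached_mtimes dictionary are hypothetical names introduced for illustration.

```python
import os
import tempfile


def needs_rescan(path, cached_mtimes):
    """Hypothetical check: True if the file is unknown or its mtime has changed."""
    return cached_mtimes.get(path) != os.stat(path).st_mtime


with tempfile.TemporaryDirectory() as tmp:
    module_path = os.path.join(tmp, "foo.py")
    with open(module_path, "w") as f:
        f.write("import os\n")

    # Record the modification time, as a cache might after the first scan.
    cached_mtimes = {module_path: os.stat(module_path).st_mtime}
    unchanged = needs_rescan(module_path, cached_mtimes)

    # Simulate a reinstall: identical content, but a newer timestamp.
    stat = os.stat(module_path)
    os.utime(module_path, (stat.st_atime, stat.st_mtime + 10))
    after_reinstall = needs_rescan(module_path, cached_mtimes)
```

In this sketch the file's content never changes, yet the second check reports that a rescan is needed. A check based only on timestamps behaves this way whenever files are rewritten between runs, which is the situation described above for continuous integration servers.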
Location of the cache
Cache files are written, by default, to a .grimp_cache directory
in the current working directory. This directory can be changed by passing
cache_dir to the build_graph function, e.g.:
graph = grimp.build_graph("mypackage", cache_dir="/path/to/cache")
Disabling caching
To skip using (and writing to) the cache, pass cache_dir=None to build_graph:
graph = grimp.build_graph("mypackage", cache_dir=None)
Concurrency
Caching isn’t currently concurrency-safe. Specifically, if you have two concurrent processes writing to the same cache files, you might experience incorrect behaviour.
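One possible workaround, sketched below, is to give each concurrent process its own cache directory so that no two processes write to the same files. The cache_dir_for_worker helper and the notion of a worker identifier are assumptions made for this example; only the cache_dir argument comes from the documentation above.

```python
import tempfile
from pathlib import Path


def cache_dir_for_worker(base_dir, worker_id):
    """Hypothetical helper: return (and create) a cache directory for one worker."""
    path = Path(base_dir) / f"worker-{worker_id}"
    # Create the directory up front so the sketch does not depend on
    # whether the library would create it.
    path.mkdir(parents=True, exist_ok=True)
    return str(path)


with tempfile.TemporaryDirectory() as base:
    dir_a = cache_dir_for_worker(base, "a")
    dir_b = cache_dir_for_worker(base, "b")
    dirs_exist = Path(dir_a).is_dir() and Path(dir_b).is_dir()

# Each process would then pass its own directory, for example:
# graph = grimp.build_graph("mypackage", cache_dir=dir_a)
```

Using a stable identifier rather than something like a process ID means each worker can reuse its own cache on later runs. The trade-off is that workers do not benefit from each other's cached results.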