
I often do interactive data analysis in notebooks, using tools that I am also developing. Naturally, I often find a bug or missing feature in one of my functions, right at the very end of the analysis.

After fixing myfunction, say, I want to avoid running everything again (since that might take a very long time), so I turn to importlib.reload:

import importlib
import mymodule
importlib.reload(mymodule)
from mymodule import myfunction

I think it would be nice if that was just myfunction = importlib.reload(myfunction).

Currently reload() refuses anything that’s not a module, but why not have it look up the __module__ and __qualname__ when given an object, and automate the import whenever possible?
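A minimal sketch of what that lookup could look like, written as a standalone helper rather than as a change to importlib. The helper name reload_object and the throwaway module demo_reload_a are hypothetical, made up for this illustration; the sketch also assumes the object has both __module__ and __qualname__ and is reachable from its module under that qualified name.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def reload_object(obj):
    """Hypothetical helper: reload the module that defines obj, then look obj up again."""
    module = importlib.reload(importlib.import_module(obj.__module__))
    target = module
    # __qualname__ is a dotted path such as "MyClass.method"; walk it attribute by attribute.
    for part in obj.__qualname__.split("."):
        target = getattr(target, part)
    return target


# Demo with a throwaway module written to a temporary directory.
sys.dont_write_bytecode = True  # keep cached .pyc files out of the demo
tmp = Path(tempfile.mkdtemp())
(tmp / "demo_reload_a.py").write_text("def myfunction():\n    return 'old'\n")
sys.path.insert(0, str(tmp))
importlib.invalidate_caches()

from demo_reload_a import myfunction
before = myfunction()

# "Fix" the function on disk, then refresh the name in a single call.
(tmp / "demo_reload_a.py").write_text("def myfunction():\n    return 'fixed'\n")
myfunction = reload_object(myfunction)
after = myfunction()
print(before, after)
```

Objects defined inside another function have "<locals>" in their __qualname__, so the attribute walk would fail for them with an AttributeError.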

Nicolas Tessore:

Currently reload() refuses anything that’s not a module, but why not have it look up the __module__ and __qualname__ when given an object, and automate the import whenever possible?

Keep in mind that from x import y works with any attribute y, not just functions (or classes). Some of those won’t have those attributes, either.

Neil Girdhar:

Good idea, but why not go one step further and add a magic %reload to the notebook code so that you can do %reload(myfunction)?

Sure, that’s a good idea for added convenience in Jupyter notebooks! But my point stands that reload() could be made to understand more than just modules.

Nicolas Tessore:

But my point stands that reload() could be made to understand more than just modules.

Yeah, but you’re broadening and complicating the interface. Since your argument was to improve the notebook experience, I think the easiest place to do that is in the notebook code.

My initial thought was that this was an interesting idea worth pursuing, but alas there’s a problem:

  • In general, reload cannot tell what name to import from the module. The best it can do is guess, which is risky.
  • reload receives the myfunction object as its argument, not the name. Depending on what the object is, it may or may not have a __module__ attribute. If it does, then reload could reload that module. But then it’s stuck: how can it determine which name to import?

    The name “myfunction” is not accessible to reload. The best it could do is inspect the object for a __name__ attribute, and guess that importing that name will Do What You Mean. But this is fragile and error-prone, and relies on implementation details of myfunction.

    Like all DWIM systems, when it goes wrong it will lead to problems, in this case returning the wrong object.

    Such guessing functions are best left for your own personal toolkit, where you have nobody to blame but yourself if it returns the wrong object, rather than parts of the language.

    >>> from mymodule import* myfunction

    The import* would act as a regular import and would also work like reload.
    It would also be helpful when “script.py” is reloaded and dependent modules such as “mymodule” should be reloaded too.

    Steven D'Aprano:

    Such guessing functions are best left for your own personal toolkit, where you have nobody to blame but yourself if it returns the wrong object, rather than parts of the language.

    This is not any more risky than other parts for which the name dunders are already used, e.g. pickling a function.

    Steven D'Aprano:

    The name “myfunction” is not accessible to reload. The best it could do is inspect the object for a __name__ attribute, and guess that importing that name will Do What You Mean. But this is fragile and error-prone, and relies on implementation details of myfunction.

    I don’t think this is fundamentally different from what it’s already doing when it reloads a module: it looks for a specially-named attribute (__name__), which is a string specifying a module name, and guesses that importing that name will re-create a module that is conceptually the same as the module that was passed in.

    That can, in principle, be defeated: create and import a module; then manipulate sys.path such that a different .py file with the same name will be found first; then modify the original module and attempt to reload it. Instead of seeing the changes to the original code, the module gets entirely replaced with the other one that was found instead.
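The scenario can be reproduced with two temporary directories. This is only a sketch: the module name shadow_demo and both directories are invented for the demonstration, and bytecode caching is switched off to keep the example predictable.

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # keep cached .pyc files out of the demo

dir_a = Path(tempfile.mkdtemp())
dir_b = Path(tempfile.mkdtemp())
(dir_a / "shadow_demo.py").write_text("WHERE = 'original'\n")
(dir_b / "shadow_demo.py").write_text("WHERE = 'impostor'\n")

# Import the module from the first directory.
sys.path.insert(0, str(dir_a))
importlib.invalidate_caches()
import shadow_demo
first = shadow_demo.WHERE

# A second directory now comes earlier on sys.path ...
sys.path.insert(0, str(dir_b))
# ... and the original file is edited.
(dir_a / "shadow_demo.py").write_text("WHERE = 'original, edited'\n")

# reload() searches for the module by name again, so it finds the other file.
importlib.reload(shadow_demo)
second = shadow_demo.WHERE
print(first, second)
```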

    It’s true that the __name__ of a function might not match the variable name passed to importlib.reload - but this happens because the function was aliased locally. The original __name__ value should, clearly, be used - it’s not as if anyone is in the habit of reassigning that (although they can, and should bear the consequences).

    It’s also true - as I pointed out earlier - that not everything has a __name__, and that import syntax allows for “importing” any arbitrary attribute from a module, which might have any arbitrary type. However, I think catching the resulting AttributeError and converting it to an ImportError ought to be enough for these circumstances. “You can’t always get a meaningful result” isn’t a reason for not, pardon the pun, trying to implement some functionality.
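A rough sketch of that error handling, assuming a standalone helper (the name reload_or_import_error is hypothetical) that uses __module__ and __name__ and reports every lookup failure as an ImportError:

```python
import importlib


def reload_or_import_error(obj):
    """Hypothetical helper: reload obj's module; lookup failures become ImportError."""
    try:
        module_name = obj.__module__
        name = obj.__name__
    except AttributeError as exc:
        raise ImportError(f"cannot tell where {obj!r} was imported from") from exc
    module = importlib.reload(importlib.import_module(module_name))
    try:
        return getattr(module, name)
    except AttributeError as exc:
        raise ImportError(f"cannot import name {name!r} from {module_name!r}") from exc


from math import pi            # a float: it has neither __module__ nor __name__
from textwrap import dedent    # a plain function defined in textwrap

try:
    reload_or_import_error(pi)
except ImportError as exc:
    message = str(exc)
    print(message)

# For an ordinary function the helper returns a freshly created function object.
fresh_dedent = reload_or_import_error(dedent)
print(fresh_dedent is dedent)
```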

    However, there is another complication here. As I said, the import syntax allows for “importing” any arbitrary attribute from a module, which might have any arbitrary type. Including, you know, module. Which is how importing a module from a package works: packages are modules, and a module in a package is an attribute of that package.

    That would cause an ambiguity, or at least an inconsistency, with the proposal. Suppose we previously did from foo import bar, and then attempt importlib.reload(bar). If we first check whether bar is a module (like with the current code), we would simply re-load the bar module directly (and reassign it as an attribute of the foo package). However, if bar isn’t a module, we would necessarily have to reload foo; and some might therefore expect foo to be reloaded even if bar is a module.
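The inconsistency shows up with two names imported from the same package. In this sketch, module_to_reload is a hypothetical function that only reports which module an object-aware reload() would re-execute under the rule described above; it does not reload anything.

```python
import types
from json import decoder   # a submodule of the json package
from json import loads     # a function defined in json/__init__.py


def module_to_reload(obj):
    """Hypothetical dispatch rule: name the module that would be re-executed."""
    if isinstance(obj, types.ModuleType):
        return obj.__name__      # current behaviour: reload the module itself
    return obj.__module__        # proposed behaviour: reload the defining module


print(module_to_reload(decoder))  # json.decoder
print(module_to_reload(loads))    # json
```

Both names came from "from json import ...", yet only the second would cause the json package itself to be reloaded.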

    “Explicit is better than implicit”, and “special cases aren’t special enough to break the rules”. It makes more sense to have code that’s clear and consistent about what needs to be imported.

    Regarding the original example:

    Nicolas Tessore:

    I think it would be nice if that was just myfunction = importlib.reload(myfunction).

    In fact, we almost have it already: myfunction = importlib.reload(mymodule).myfunction. I think that’s probably the best option here: it’s clear what’s going on, and it avoids using an extra import statement after the code has already been imported, simply to bind a name.

    It does repeat the myfunction name still, but that’s a separate proposal.

    Karl Knechtel:

    That can, in principle, be defeated: create and import a module; then manipulate sys.path such that a different .py file with the same name will be found first; then modify the original module and attempt to reload it. Instead of seeing the changes to the original code, the module gets entirely replaced with the other one that was found instead.

    Aside: I don’t consider this to be “defeating” it. It’s the correct behaviour of importing the name.

    >>> import random
    Oops I shadowed random.py
    >>> import importlib, os
    >>> os.unlink("random.py")
    >>> importlib.reload(random)
    <module 'random' from '/usr/local/lib/python3.12/random.py'>
    

    A feature of “reload this function” would need to be aware of func.__wrapped__ to be able to properly cope with decorated functions, and would have a huge number of assumptions (for example, random.randrange is actually a bound method from the Random object, and reloading it has to assume that the name has been maintained, which is usually the case). I’m dubious as to how useful it would be though, because of this problem:

    import importlib
    import random
    from random import randrange, sample
    assert randrange.__self__ is sample.__self__ # or any other proof that they're from the same module
    sample = importlib.reload(random).sample
    print(randrange.__self__ is sample.__self__)  # False: randrange is still bound to the old instance
    

    So unless you ONLY imported a single name from the module, it could be very very confusing, since some names will (presumably) still be from the old module.
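The same confusion can be shown with a class instead of a bound method. This sketch uses a throwaway module (stale_demo, a made-up name) and refreshes only one of the two names imported from it.

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # keep cached .pyc files out of the demo
tmp = Path(tempfile.mkdtemp())
(tmp / "stale_demo.py").write_text(
    "class Point:\n"
    "    pass\n"
    "\n"
    "def make_point():\n"
    "    return Point()\n"
)
sys.path.insert(0, str(tmp))
importlib.invalidate_caches()

import stale_demo
from stale_demo import Point, make_point

same_before = isinstance(make_point(), Point)

# Refresh only one of the two imported names.
make_point = importlib.reload(stale_demo).make_point

# make_point now builds instances of the new Point class,
# while the local name Point still refers to the old one.
same_after = isinstance(make_point(), Point)
print(same_before, same_after)  # True False
```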

    Karl Knechtel:

    In fact, we almost have it already: myfunction = importlib.reload(mymodule).myfunction. I think that’s probably the best option here: it’s clear what’s going on, and it avoids using an extra import statement after the code has already been imported, simply to bind a name.

    You might also need to import mymodule first. Moving that extra (tiny) bit of typing into reload() is all that I am proposing.

    PS: And you might need to inspect __module__ yourself if myfunction was imported into mymodule in the first place.
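To illustrate the PS, here is a sketch with two throwaway modules (impl_demo and facade_demo are invented names): the function is defined in one and merely re-exported by the other, so reloading the module it was imported from is not enough.

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # keep cached .pyc files out of the demo
tmp = Path(tempfile.mkdtemp())
(tmp / "impl_demo.py").write_text("def myfunction():\n    return 'old'\n")
(tmp / "facade_demo.py").write_text("from impl_demo import myfunction\n")
sys.path.insert(0, str(tmp))
importlib.invalidate_caches()

import facade_demo
from facade_demo import myfunction

defined_in = myfunction.__module__   # the defining module, not facade_demo

# Fix the function in the module that actually defines it.
(tmp / "impl_demo.py").write_text("def myfunction():\n    return 'fixed'\n")

# Reloading the module we imported from does not re-execute impl_demo.
via_facade = importlib.reload(facade_demo).myfunction()

# Reloading the module named by __module__ does.
via_dunder = importlib.reload(sys.modules[defined_in]).myfunction()
print(defined_in, via_facade, via_dunder)
```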

    Chris Angelico:

    So unless you ONLY imported a single name from the module, it could be very very confusing, since some names will (presumably) still be from the old module.

    That’s a criticism of using reload() generally.

    Rosuav:

    So unless you ONLY imported a single name from the module, it could be very very confusing, since some names will (presumably) still be from the old module.

    That’s a criticism of using reload() generally.

    True, but if you’ve only ever used import modulename, they’ll all update simultaneously. So this would be another thing to keep track of.
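A small sketch of the difference, using a throwaway module (live_demo is a made-up name): attribute access through the module object picks up every change after a reload, while a name bound earlier with "from ... import" does not.

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # keep cached .pyc files out of the demo
tmp = Path(tempfile.mkdtemp())
source = "def greet():\n    return {0!r}\n\ndef part():\n    return {0!r}\n"
(tmp / "live_demo.py").write_text(source.format("old"))
sys.path.insert(0, str(tmp))
importlib.invalidate_caches()

import live_demo
from live_demo import greet

(tmp / "live_demo.py").write_text(source.format("fixed"))
importlib.reload(live_demo)

# Looked up through the module, both functions are the new versions.
via_module = (live_demo.greet(), live_demo.part())
# The name bound earlier still refers to the old function object.
via_name = greet()
print(via_module, via_name)
```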

    I’ll be honest, though: I have literally NEVER used importlib.reload in any useful way. When I want hot reloading capabilities, I usually build my own, not using the import system at all.

    kknechtel:

    In fact, we almost have it already: myfunction = importlib.reload(mymodule).myfunction. I think that’s probably the best option here: it’s clear what’s going on, and it avoids using an extra import statement after the code has already been imported, simply to bind a name.

    You might also need to import mymodule first. Moving that extra (tiny) bit of typing into reload() is all that I am proposing.

    And you have to import importlib too. :wink:

    I understand @steven.daprano mentioned that reload(function) would have to reload function.__module__ first and then find the function by its __name__. But function.__name__ is fragile (e.g. when decorated, or intentionally renamed), so you cannot always recover the function object from the name.
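A short illustration of how the name dunders can drift from the name a function is bound to. The decorators bare and polite and the functions first and second are invented for this example.

```python
import functools
import random


def bare(func):
    """Decorator that does not copy the wrapped function's metadata."""
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def polite(func):
    """Decorator that copies metadata with functools.wraps."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@bare
def first():
    return 1


@polite
def second():
    return 2


print(first.__name__)                  # wrapper
print(second.__name__)                 # second
print(hasattr(second, "__wrapped__"))  # True

# random.randrange is a bound method, so its qualified name points at the class.
print(random.randrange.__name__)       # randrange
print(random.randrange.__qualname__)   # Random.randrange
```

A lookup based on first.__name__ would search the module for "wrapper", which is not a module-level name at all.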

    For my use case, I often use the following method:

    require('mymodule'); from mymodule import myfunction
    

    where

    import sys

    def require(name):
        """Import the named module, or reload it if it is already loaded."""
        from importlib import import_module, reload
        if name in sys.modules:
            return reload(sys.modules[name])
        return import_module(name)