Python: Variable Referenced Before Assignment

General Questions¶

Is there a source code level debugger with breakpoints, single-stepping, etc.?¶

Yes.

The pdb module is a simple but adequate console-mode debugger for Python. It is part of the standard Python library, and is documented in the Library Reference Manual. You can also write your own debugger by using the code for pdb as an example.
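
As a minimal, illustrative sketch (the script and breakpoint placement below are made up for this example), you can either run a script under pdb from the command line with "python -m pdb myscript.py", or drop into the debugger from inside the code:

import pdb

def average(values):
    total = 0
    for v in values:
        pdb.set_trace()   # execution pauses here; use n, s, p, c at the (Pdb) prompt
        total += v
    return total / len(values)

print(average([2, 4, 6]))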

The IDLE interactive development environment, which is part of the standard Python distribution (normally available as Tools/scripts/idle), includes a graphical debugger.

PythonWin is a Python IDE that includes a GUI debugger based on pdb. The Pythonwin debugger colors breakpoints and has quite a few cool features such as debugging non-Pythonwin programs. Pythonwin is available as part of the Python for Windows Extensions project and as a part of the ActivePython distribution (see https://www.activestate.com/activepython).

Boa Constructor is an IDE and GUI builder that uses wxWidgets. It offers visual frame creation and manipulation, an object inspector, many views on the source like object browsers, inheritance hierarchies, doc string generated html documentation, an advanced debugger, integrated help, and Zope support.

Eric is an IDE built on PyQt and the Scintilla editing component.

Pydb is a version of the standard Python debugger pdb, modified for use with DDD (Data Display Debugger), a popular graphical debugger front end. Pydb can be found at http://bashdb.sourceforge.net/pydb/ and DDD can be found at https://www.gnu.org/software/ddd.

There are a number of commercial Python IDEs that include graphical debuggers.

How can I create a stand-alone binary from a Python script?¶

You don’t need the ability to compile Python to C code if all you want is a stand-alone program that users can download and run without having to install the Python distribution first. There are a number of tools that determine the set of modules required by a program and bind these modules together with a Python binary to produce a single executable.

One is to use the freeze tool, which is included in the Python source tree as Tools/freeze. It converts Python byte code to C arrays; with a C compiler you can embed all your modules into a new program, which is then linked with the standard Python modules.

It works by scanning your source recursively for import statements (in both forms) and looking for the modules in the standard Python path as well as in the source directory (for built-in modules). It then turns the bytecode for modules written in Python into C code (array initializers that can be turned into code objects using the marshal module) and creates a custom-made config file that only contains those built-in modules which are actually used in the program. It then compiles the generated C code and links it with the rest of the Python interpreter to form a self-contained binary which acts exactly like your script.

Obviously, freeze requires a C compiler. There are several other utilities which don’t. One is Thomas Heller’s py2exe (Windows only) at

http://www.py2exe.org/

Another tool is Anthony Tuininga’s cx_Freeze.
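
As a rough sketch only (the exact options vary between cx_Freeze versions, and "hello.py" is a hypothetical script), a minimal cx_Freeze setup script looks roughly like this; running "python setup.py build" then produces the stand-alone binary:

# setup.py -- minimal cx_Freeze configuration (illustrative)
from cx_Freeze import setup, Executable

setup(
    name="hello",
    version="0.1",
    description="Stand-alone build of hello.py",
    executables=[Executable("hello.py")],
)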

Core Language¶

Why am I getting an UnboundLocalError when the variable has a value?¶

It can be a surprise to get the UnboundLocalError in previously working code when it is modified by adding an assignment statement somewhere in the body of a function.

This code:

>>> x = 10
>>> def bar():
...     print(x)
...
>>> bar()
10

works, but this code:

>>> x = 10
>>> def foo():
...     print(x)
...     x += 1

results in an UnboundLocalError:

>>> foo()
Traceback (most recent call last):
  ...
UnboundLocalError: local variable 'x' referenced before assignment

This is because when you make an assignment to a variable in a scope, that variable becomes local to that scope and shadows any similarly named variable in the outer scope. Since the last statement in foo assigns a new value to x, the compiler recognizes it as a local variable. Consequently, when the earlier print(x) attempts to print the uninitialized local variable, an error results.

In the example above you can access the outer scope variable by declaring it global:

>>> x = 10
>>> def foobar():
...     global x
...     print(x)
...     x += 1
...
>>> foobar()
10

This explicit declaration is required in order to remind you that (unlike the superficially analogous situation with class and instance variables) you are actually modifying the value of the variable in the outer scope:

>>> print(x)
11

You can do a similar thing in a nested scope using the nonlocal keyword:

>>> def foo():
...     x = 10
...     def bar():
...         nonlocal x
...         print(x)
...         x += 1
...     bar()
...     print(x)
...
>>> foo()
10
11

What are the rules for local and global variables in Python?¶

In Python, variables that are only referenced inside a function are implicitly global. If a variable is assigned a value anywhere within the function’s body, it’s assumed to be a local unless explicitly declared as global.

Though a bit surprising at first, a moment's consideration explains this. On one hand, requiring global for assigned variables provides a bar against unintended side-effects. On the other hand, if global was required for all global references, you'd be using global all the time. You'd have to declare as global every reference to a built-in function or to a component of an imported module. This clutter would defeat the usefulness of the global declaration for identifying side-effects.
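
A short sketch of the rule (the names here are made up for illustration): a name that is only read inside a function is looked up in the enclosing/global scope, while a name that is assigned anywhere in the function body is local unless declared global:

threshold = 100          # module-level name

def small_enough(values):
    # 'threshold' is only referenced, never assigned, so the global one is used
    return [v for v in values if v < threshold]

def reset_counter():
    # assignment makes 'counter' local to this function; rebinding the
    # module-level name would require an explicit 'global counter'
    counter = 0
    return counter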

Why do lambdas defined in a loop with different values all return the same result?¶

Assume you use a for loop to define a few different lambdas (or even plain functions), e.g.:

>>> squares = []
>>> for x in range(5):
...     squares.append(lambda: x**2)

This gives you a list that contains 5 lambdas that calculate x**2. You might expect that, when called, they would return, respectively, 0, 1, 4, 9, and 16. However, when you actually try you will see that they all return 16:

>>> squares[2]()
16
>>> squares[4]()
16

This happens because x is not local to the lambdas, but is defined in the outer scope, and it is accessed when the lambda is called, not when it is defined. At the end of the loop, the value of x is 4, so all the functions now return 4**2, i.e. 16. You can also verify this by changing the value of x and seeing how the results of the lambdas change:

>>> x = 8
>>> squares[2]()
64

In order to avoid this, you need to save the values in variables local to the lambdas, so that they don't rely on the value of the global x:

>>> squares = []
>>> for x in range(5):
...     squares.append(lambda n=x: n**2)

Here, n=x creates a new variable n local to the lambda and computed when the lambda is defined so that it has the same value that x had at that point in the loop. This means that the value of n will be 0 in the first lambda, 1 in the second, 2 in the third, and so on. Therefore each lambda will now return the correct result:

>>> squares[2]()
4
>>> squares[4]()
16

Note that this behaviour is not peculiar to lambdas, but applies to regular functions too.

How do I share global variables across modules?¶

The canonical way to share information across modules within a single program is to create a special module (often called config or cfg). Just import the config module in all modules of your application; the module then becomes available as a global name. Because there is only one instance of each module, any changes made to the module object get reflected everywhere. For example:

config.py:

x = 0   # Default value of the 'x' configuration setting

mod.py:

import config
config.x = 1

main.py:

import config
import mod
print(config.x)

Note that using a module is also the basis for implementing the Singleton design pattern, for the same reason.

What are the “best practices” for using import in a module?¶

In general, don’t use from modulename import *. Doing so clutters the importer’s namespace, and makes it much harder for linters to detect undefined names.

Import modules at the top of a file. Doing so makes it clear what other modules your code requires and avoids questions of whether the module name is in scope. Using one import per line makes it easy to add and delete module imports, but using multiple imports per line uses less screen space.

It’s good practice if you import modules in the following order:

  1. standard library modules – e.g. sys, os, getopt, re
  2. third-party library modules (anything installed in Python’s site-packages directory) – e.g. mx.DateTime, ZODB, PIL.Image, etc.
  3. locally-developed modules

It is sometimes necessary to move imports to a function or class to avoid problems with circular imports. Gordon McMillan says:

Circular imports are fine where both modules use the “import <module>” form of import. They fail when the 2nd module wants to grab a name out of the first (“from module import name”) and the import is at the top level. That’s because names in the 1st are not yet available, because the first module is busy importing the 2nd.

In this case, if the second module is only used in one function, then the import can easily be moved into that function. By the time the import is called, the first module will have finished initializing, and the second module can do its import.
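
A minimal sketch of that workaround (the module names a.py and b.py are hypothetical): a.py imports b at the top level, while b only imports from a inside the function that needs it, so the import runs after a has finished initializing:

# a.py
import b

def spam():
    return "spam"

# b.py
def eggs():
    from a import spam   # deferred import: a is fully initialized by now
    return spam() + " and eggs"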

It may also be necessary to move imports out of the top level of code if some of the modules are platform-specific. In that case, it may not even be possible to import all of the modules at the top of the file. In this case, importing the correct modules in the corresponding platform-specific code is a good option.

Only move imports into a local scope, such as inside a function definition, if it’s necessary to solve a problem such as avoiding a circular import or you are trying to reduce the initialization time of a module. This technique is especially helpful if many of the imports are unnecessary depending on how the program executes. You may also want to move imports into a function if the modules are only ever used in that function. Note that loading a module the first time may be expensive because of the one-time initialization of the module, but loading a module multiple times is virtually free, costing only a couple of dictionary lookups. Even if the module name has gone out of scope, the module is probably available in sys.modules.

Why are default values shared between objects?¶

This type of bug commonly bites neophyte programmers. Consider this function:

def foo(mydict={}):  # Danger: shared reference to one dict for all calls
    ... compute something ...
    mydict[key] = value
    return mydict

The first time you call this function, mydict contains a single item. The second time, mydict contains two items because when foo() begins executing, mydict starts out with an item already in it.

It is often expected that a function call creates new objects for default values. This is not what happens. Default values are created exactly once, when the function is defined. If that object is changed, like the dictionary in this example, subsequent calls to the function will refer to this changed object.

By definition, immutable objects such as numbers, strings, tuples, and None, are safe from change. Changes to mutable objects such as dictionaries, lists, and class instances can lead to confusion.

Because of this feature, it is good programming practice to not use mutable objects as default values. Instead, use None as the default value and inside the function, check if the parameter is None and create a new list/dictionary/whatever if it is. For example, don’t write:

def foo(mydict={}):
    ...

but:

def foo(mydict=None):
    if mydict is None:
        mydict = {}  # create a new dict for local namespace

This feature can be useful. When you have a function that’s time-consuming to compute, a common technique is to cache the parameters and the resulting value of each call to the function, and return the cached value if the same value is requested again. This is called “memoizing”, and can be implemented like this:

# Callers will never provide a third parameter for this function.
def expensive(arg1, arg2, _cache={}):
    if (arg1, arg2) in _cache:
        return _cache[(arg1, arg2)]

    # Calculate the value
    result = ... expensive computation ...
    _cache[(arg1, arg2)] = result   # Store result in the cache
    return result

You could use a global variable containing a dictionary instead of the default value; it’s a matter of taste.

How can I pass optional or keyword parameters from one function to another?¶

Collect the arguments using the * and ** specifiers in the function’s parameter list; this gives you the positional arguments as a tuple and the keyword arguments as a dictionary. You can then pass these arguments when calling another function by using * and **:

def f(x, *args, **kwargs):
    ...
    kwargs['width'] = '14.3c'
    ...
    g(x, *args, **kwargs)
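
For example, a small sketch of a caller/callee pair (g() and the argument values are invented for illustration) showing how the collected arguments are forwarded unchanged:

def g(x, *args, **kwargs):
    print(x, args, kwargs)

def f(x, *args, **kwargs):
    kwargs['width'] = '14.3c'
    g(x, *args, **kwargs)

f(1, 2, 3, color='blue')
# prints: 1 (2, 3) {'color': 'blue', 'width': '14.3c'}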

What is the difference between arguments and parameters?¶

Parameters are defined by the names that appear in a function definition, whereas arguments are the values actually passed to a function when calling it. Parameters define what types of arguments a function can accept. For example, given the function definition:

def func(foo, bar=None, **kwargs):
    pass

foo, bar and kwargs are parameters of func. However, when calling func, for example:

func(42, bar=314, extra=somevar)

the values 42, 314, and somevar are arguments.

Why did changing list ‘y’ also change list ‘x’?¶

If you wrote code like:

>>> x = []
>>> y = x
>>> y.append(10)
>>> y
[10]
>>> x
[10]

you might be wondering why appending an element to y changed x too.

There are two factors that produce this result:

  1. Variables are simply names that refer to objects. Doing y = x doesn’t create a copy of the list – it creates a new variable y that refers to the same object x refers to. This means that there is only one object (the list), and both x and y refer to it.
  2. Lists are mutable, which means that you can change their content.

After the call to append(), the content of the mutable object has changed from [] to [10]. Since both the variables refer to the same object, using either name accesses the modified value [10].

If we instead assign an immutable object to x:

>>> x = 5  # ints are immutable
>>> y = x
>>> x = x + 1  # 5 can't be mutated, we are creating a new object here
>>> x
6
>>> y
5

we can see that in this case x and y are not equal anymore. This is because integers are immutable, and when we do x = x + 1 we are not mutating the int 5 by incrementing its value; instead, we are creating a new object (the int 6) and assigning it to x (that is, changing which object x refers to). After this assignment we have two objects (the ints 6 and 5) and two variables that refer to them (x now refers to 6 but y still refers to 5).

Some operations (for example y.append(10) and y.sort()) mutate the object, whereas superficially similar operations (for example y = y + [10] and sorted(y)) create a new object. In general in Python (and in all cases in the standard library) a method that mutates an object will return None to help avoid getting the two types of operations confused. So if you mistakenly write y.sort() thinking it will give you a sorted copy of y, you’ll instead end up with None, which will likely cause your program to generate an easily diagnosed error.

However, there is one class of operations where the same operation sometimes has different behaviors with different types: the augmented assignment operators. For example, += mutates lists but not tuples or ints (a_list += [1, 2, 3] is equivalent to a_list.extend([1, 2, 3]) and mutates a_list, whereas some_tuple += (1, 2, 3) and some_int += 1 create new objects).

In other words:

  • If we have a mutable object (list, dict, set, etc.), we can use some specific operations to mutate it and all the variables that refer to it will see the change.
  • If we have an immutable object (str, int, tuple, etc.), all the variables that refer to it will always see the same value, but operations that transform that value into a new value always return a new object.

If you want to know if two variables refer to the same object or not, you can use the is operator, or the built-in function id().

How do I write a function with output parameters (call by reference)?¶

Remember that arguments are passed by assignment in Python. Since assignment just creates references to objects, there’s no alias between an argument name in the caller and callee, and so no call-by-reference per se. You can achieve the desired effect in a number of ways.

  1. By returning a tuple of the results:

    This is almost always the clearest solution.

    def func2(a, b):
        a = 'new-value'        # a and b are local names
        b = b + 1              # assigned to new objects
        return a, b            # return new values

    x, y = 'old-value', 99
    x, y = func2(x, y)
    print(x, y)                # output: new-value 100
  2. By using global variables. This isn’t thread-safe, and is not recommended.

  3. By passing a mutable (changeable in-place) object:

    def func1(a):
        a[0] = 'new-value'     # 'a' references a mutable list
        a[1] = a[1] + 1        # changes a shared object

    args = ['old-value', 99]
    func1(args)
    print(args[0], args[1])    # output: new-value 100
  4. By passing in a dictionary that gets mutated:

    def func3(args):
        args['a'] = 'new-value'        # args is a mutable dictionary
        args['b'] = args['b'] + 1      # change it in-place

    args = {'a': 'old-value', 'b': 99}
    func3(args)
    print(args['a'], args['b'])
  5. Or bundle up values in a class instance:

    There’s almost never a good reason to get this complicated.

    class callByRef:
        def __init__(self, **args):
            for key, value in args.items():
                setattr(self, key, value)

    def func4(args):
        args.a = 'new-value'       # args is a mutable callByRef
        args.b = args.b + 1        # change object in-place

    args = callByRef(a='old-value', b=99)
    func4(args)
    print(args.a, args.b)

Your best choice is to return a tuple containing the multiple results.

How do you make a higher order function in Python?¶

You have two choices: you can use nested scopes or you can use callable objects. For example, suppose you wanted to define linear(a, b) which returns a function f(x) that computes the value a*x+b. Using nested scopes:

def linear(a, b):
    def result(x):
        return a * x + b
    return result

Or using a callable object:

class linear:

    def __init__(self, a, b):
        self.a, self.b = a, b

    def __call__(self, x):
        return self.a * x + self.b

In both cases,

taxes = linear(0.3, 2)

gives a callable object where taxes(10e6) == 0.3 * 10e6 + 2.

The callable object approach has the disadvantage that it is a bit slower and results in slightly longer code. However, note that a collection of callables can share their signature via inheritance:

class exponential(linear):
    # __init__ inherited
    def __call__(self, x):
        return self.a * (x ** self.b)

Object can encapsulate state for several methods:

class counter:

    value = 0

    def set(self, x):
        self.value = x

    def up(self):
        self.value = self.value + 1

    def down(self):
        self.value = self.value - 1

count = counter()
inc, dec, reset = count.up, count.down, count.set

Here inc(), dec() and reset() act like functions which share the same counting variable.

How can my code discover the name of an object?¶

Generally speaking, it can’t, because objects don’t really have names. Essentially, assignment always binds a name to a value; the same is true of def and class statements, but in that case the value is a callable. Consider the following code:

Arguably the class has a name: even though it is bound to two names and invoked through the name B the created instance is still reported as an instance of class A. However, it is impossible to say whether the instance’s name is a or b, since both names are bound to the same value.

Generally speaking it should not be necessary for your code to “know the names” of particular values. Unless you are deliberately writing introspective programs, this is usually an indication that a change of approach might be beneficial.

In comp.lang.python, Fredrik Lundh once gave an excellent analogy in answer to this question:

The same way as you get the name of that cat you found on your porch: the cat (object) itself cannot tell you its name, and it doesn’t really care – so the only way to find out what it’s called is to ask all your neighbours (namespaces) if it’s their cat (object)…

….and don’t be surprised if you’ll find that it’s known by many names, or no name at all!

>>> class A:
...     pass
...
>>> B = A
>>> a = B()
>>> b = a
>>> print(b)
<__main__.A object at 0x16D07CC>
>>> print(a)
<__main__.A object at 0x16D07CC>
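
If you really do need to "ask the neighbours", a small illustrative helper (not part of any library; written just for this example) scans a namespace for names bound to a given object:

def names_of(obj, namespace=None):
    # look through a namespace (default: module globals) for names bound to obj
    if namespace is None:
        namespace = globals()
    return [name for name, value in namespace.items() if value is obj]

class A:
    pass

B = A
print(names_of(A))   # ['A', 'B'] -- the same class is known by both names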

What’s up with the comma operator’s precedence?¶

Comma is not an operator in Python. Consider this session:

>>> "a" in "b", "a"
(False, 'a')

Since the comma is not an operator, but a separator between expressions, the above is evaluated as if you had entered:

("a" in "b"), "a"

not:

"a" in ("b", "a")

The same is true of the various assignment operators (=, += etc). They are not truly operators but syntactic delimiters in assignment statements.

Is there an equivalent of C’s “?:” ternary operator?¶

Yes, there is. The syntax is as follows:

[on_true] if [expression] else [on_false]

x, y = 50, 25
small = x if x < y else y

Before this syntax was introduced in Python 2.5, a common idiom was to use logical operators:

[expression] and [on_true] or [on_false]

However, this idiom is unsafe, as it can give wrong results when on_true has a false boolean value. Therefore, it is always better to use the ... if ... else ... form.
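
A small sketch of that failure mode (values chosen only to illustrate): when on_true is falsy, the and/or idiom silently falls through to on_false, while the conditional expression does not:

x = 0
# intended meaning: "" if x == 0 else "non-zero"
broken  = (x == 0) and "" or "non-zero"   # "" is falsy, so this yields 'non-zero'
correct = "" if x == 0 else "non-zero"    # yields ''
print(repr(broken), repr(correct))        # 'non-zero' ''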

Is it possible to write obfuscated one-liners in Python?¶

Yes. Usually this is done by nesting lambda within lambda. See the following three examples, due to Ulf Bartelt:

Don’t try this at home, kids!

from functools import reduce

# Primes < 1000
print(list(filter(None,map(lambda y:y*reduce(lambda x,y:x*y!=0,
map(lambda x,y=y:y%x,range(2,int(pow(y,0.5)+1))),1),range(2,1000)))))

# First 10 Fibonacci numbers
print(list(map(lambda x,f=lambda x,f:(f(x-1,f)+f(x-2,f)) if x>1 else 1:
f(x,f), range(10))))

# Mandelbrot set
print((lambda Ru,Ro,Iu,Io,IM,Sx,Sy:reduce(lambda x,y:x+y,map(lambda y,
Iu=Iu,Io=Io,Ru=Ru,Ro=Ro,Sy=Sy,L=lambda yc,Iu=Iu,Io=Io,Ru=Ru,Ro=Ro,i=IM,
Sx=Sx,Sy=Sy:reduce(lambda x,y:x+y,map(lambda x,xc=Ru,yc=yc,Ru=Ru,Ro=Ro,
i=i,Sx=Sx,F=lambda xc,yc,x,y,k,f=lambda xc,yc,x,y,k,f:(k<=0) or (x*x+y*y
>=4.0) or 1+f(xc,yc,x*x-y*y+xc,2.0*x*y+yc,k-1,f):f(xc,yc,x,y,k,f):chr(
64+F(Ru+x*(Ro-Ru)/Sx,yc,0,0,i)),range(Sx))):L(Iu+y*(Io-Iu)/Sy),range(Sy
))))(-2.1, 0.7, -1.2, 1.2, 30, 80, 24))
#    \___ ___/  \___ ___/  |   |   |__ lines on screen
#        V          V      |   |______ columns on screen
#        |          |      |__________ maximum of "iterations"
#        |          |_________________ range on y axis
#        |____________________________ range on x axis

Numbers and strings¶

How do I specify hexadecimal and octal integers?¶

To specify an octal digit, precede the octal value with a zero, and then a lower or uppercase “o”. For example, to set the variable “a” to the octal value “10” (8 in decimal), type:

>>> a = 0o10
>>> a
8

Hexadecimal is just as easy. Simply precede the hexadecimal number with a zero, and then a lower or uppercase “x”. Hexadecimal digits can be specified in lower or uppercase. For example, in the Python interpreter:

>>> a = 0xa5
>>> a
165
>>> b = 0XB2
>>> b
178

Why does -22 // 10 return -3?¶

It’s primarily driven by the desire that i % j have the same sign as j. If you want that, and also want:

i == (i // j) * j + (i % j)

then integer division has to return the floor. C also requires that identity to hold, and then compilers that truncate i // j need to make i % j have the same sign as i.

There are few real use cases for i % j when j is negative. When j is positive, there are many, and in virtually all of them it’s more useful for i % j to be >= 0. If the clock says 10 now, what did it say 200 hours ago? -190 % 12 == 2 is useful; -190 % 12 == -10 is a bug waiting to bite.
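
A quick interpreter check of the identity and of the clock example (the session below is illustrative):

>>> -22 // 10
-3
>>> -22 % 10
8
>>> (-22 // 10) * 10 + (-22 % 10)
-22
>>> (10 - 200) % 12     # what a 12-hour clock showed 200 hours ago
2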

How do I convert a string to a number?¶

For integers, use the built-in int() type constructor, e.g. int('144') == 144. Similarly, float() converts to floating-point, e.g. float('144') == 144.0.

By default, these interpret the number as decimal, so that int('0144') == 144 holds true and int('0x144') raises ValueError. int(string, base) takes the base to convert from as a second optional argument, so int('0x144', 16) == 324. If the base is specified as 0, the number is interpreted using Python’s rules: a leading ‘0o’ indicates octal, and ‘0x’ indicates a hex number.

Do not use the built-in function eval() if all you need is to convert strings to numbers. eval() will be significantly slower and it presents a security risk: someone could pass you a Python expression that might have unwanted side effects. For example, someone could pass __import__('os').system("rm -rf $HOME") which would erase your home directory.

eval() also has the effect of interpreting numbers as Python expressions, so that e.g. eval('09') gives a syntax error because Python does not allow leading ‘0’ in a decimal number (except ‘0’).

How do I modify a string in place?¶

You can’t, because strings are immutable. In most situations, you should simply construct a new string from the various parts you want to assemble it from. However, if you need an object with the ability to modify in-place unicode data, try using an io.StringIO object or the array module:

>>> import io
>>> s = "Hello, world"
>>> sio = io.StringIO(s)
>>> sio.getvalue()
'Hello, world'
>>> sio.seek(7)
7
>>> sio.write("there!")
6
>>> sio.getvalue()
'Hello, there!'
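
For completeness, here is a rough sketch of the array-module alternative mentioned above (the 'u' typecode still works but is deprecated in recent Python 3 releases, so treat this as illustrative):

>>> import array
>>> a = array.array('u', "Hello, world")
>>> a[0] = 'y'
>>> print(a)
array('u', 'yello, world')
>>> a.tounicode()
'yello, world'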

If you're closely following the Python tag on StackOverflow, you'll notice that the same question comes up at least once a week. The question goes on like this:

x = 10

def foo():
    x += 1
    print x

foo()

Why, when run, does this result in the following error:

Traceback (most recent call last):
  File "unboundlocalerror.py", line 8, in <module>
    foo()
  File "unboundlocalerror.py", line 4, in foo
    x += 1
UnboundLocalError: local variable 'x' referenced before assignment

There are a few variations on this question, with the same core hiding underneath. Here's one:

lst = [1, 2, 3]

def foo():
    lst.append(5)   # OK
    #lst += [5]     # ERROR here

foo()
print lst

Running the lst.append(5) statement successfully appends 5 to the list. However, substitute it for lst += [5], and it raises UnboundLocalError, although at first sight it should accomplish the same.

Although this exact question is answered in Python's official FAQ (right here), I decided to write this article with the intent of giving a deeper explanation. It will start with a basic FAQ-level answer, which should satisfy one only wanting to know how to "solve the damn problem and move on". Then, I will dive deeper, looking at the formal definition of Python to understand what's going on. Finally, I'll take a look at what happens behind the scenes in the implementation of CPython to cause this behavior.

The simple answer

As mentioned above, this problem is covered in the Python FAQ. For completeness, I want to explain it here as well, quoting the FAQ when necessary.

Let's take the first code snippet again:

x = 10

def foo():
    x += 1
    print x

foo()

So where does the exception come from? Quoting the FAQ:

This is because when you make an assignment to a variable in a scope, that variable becomes local to that scope and shadows any similarly named variable in the outer scope.

But x += 1 is similar to x = x + 1, so it should first read x, perform the addition and then assign back to x. As mentioned in the quote above, Python considers x a variable local to foo, so we have a problem - a variable is read (referenced) before it's been assigned. Python raises the UnboundLocalError exception in this case [1].

So what do we do about this? The solution is very simple - Python has the global statement just for this purpose:

x = 10

def foo():
    global x
    x += 1
    print x

foo()

This prints 11, without any errors. The global statement tells Python that inside foo, x refers to the global variable x, even if it's assigned in foo.

Actually, there is another variation on the question, for which the answer is a bit different. Consider this code:

def external():
    x = 10
    def internal():
        x += 1
        print(x)
    internal()

external()

This kind of code may come up if you're into closures and other techniques that use Python's lexical scoping rules. The error this generates is the familiar UnboundLocalError. However, applying the "global fix":

def external():
    x = 10
    def internal():
        global x
        x += 1
        print(x)
    internal()

external()

Doesn't help - another error is generated: NameError: global name 'x' is not defined. Python is right here - after all, there's no global variable named x, there's only an x in external. It may be not local to internal, but it's not global. So what can you do in this situation? If you're using Python 3, you have the nonlocal keyword. Replacing global by nonlocal in the last snippet makes everything work as expected. nonlocal is a new statement in Python 3, and there is no equivalent in Python 2 [2].
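
For completeness, a sketch of the nonlocal variant described above (Python 3 only):

def external():
    x = 10
    def internal():
        nonlocal x    # bind to the x in external(), not a global
        x += 1
        print(x)
    internal()

external()   # prints 11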

The formal answer

Assignments in Python are used to bind names to values and to modify attributes or items of mutable objects. I could find two places in the Python (2.x) documentation where it's defined how an assignment to a local variable works.

One is section 6.2 "Assignment statements" in the Simple Statements chapter of the language reference:

Assignment of an object to a single target is recursively defined as follows. If the target is an identifier (name):

  • If the name does not occur in a global statement in the current code block: the name is bound to the object in the current local namespace.
  • Otherwise: the name is bound to the object in the current global namespace.

Another is section 4.1 "Naming and binding" of the Execution model chapter:

If a name is bound in a block, it is a local variable of that block.

[...]

When a name is used in a code block, it is resolved using the nearest enclosing scope. [...] If the name refers to a local variable that has not been bound, an UnboundLocalError exception is raised.

This is all clear, but still, another small doubt remains. All these rules apply to assignments of the form x = value, which clearly bind x to value. But the code snippets we're having a problem with here have the x += 1 assignment. Shouldn't that just modify the bound value, without re-binding it?

Well, no. += and its cousins (-=, *=, etc.) are what Python calls "augmented assignment statements" [emphasis mine]:

An augmented assignment evaluates the target (which, unlike normal assignment statements, cannot be an unpacking) and the expression list, performs the binary operation specific to the type of assignment on the two operands, and assigns the result to the original target. The target is only evaluated once.

An augmented assignment expression like x += 1 can be rewritten as x = x + 1 to achieve a similar, but not exactly equal effect. In the augmented version, x is only evaluated once. Also, when possible, the actual operation is performed in-place, meaning that rather than creating a new object and assigning that to the target, the old object is modified instead.

With the exception of assigning to tuples and multiple targets in a single statement, the assignment done by augmented assignment statements is handled the same way as normal assignments. Similarly, with the exception of the possible in-place behavior, the binary operation performed by augmented assignment is the same as the normal binary operations.

So when earlier I said that x += 1 is similar to x = x + 1, I wasn't telling all the truth, but it was accurate with respect to binding. Apart from possible optimization, x += 1 counts exactly as x = x + 1 when binding is considered. If you think carefully about it, it's unavoidable, because some types Python works with are immutable. Consider strings, for example:

x = "abc" x += "def"

The first line binds x to the value "abc". The second line doesn't modify the value "abc" to be "abcdef". Strings are immutable in Python. Rather, it creates the new value "abcdef" somewhere in memory, and re-binds x to it. This can be seen clearly when examining the object ID of x before and after the +=:

>>> x = "abc" >>> id(x) 11173824 >>> x += "def" >>> id(x) 32831648 >>> x 'abcdef'

Note that some types in Python are mutable. For example, lists can actually be modified in-place:

>>> y = [1, 2]
>>> id(y)
32413376
>>> y += [2, 3]
>>> id(y)
32413376
>>> y
[1, 2, 2, 3]

id(y) didn't change after the +=, because the object referenced by y was just modified. Still, Python re-bound y to the same object [3].

The "too much information" answer

This section is of interest only to those curious about the implementation internals of Python itself.

One of the stages in the compilation of Python into bytecode is building the symbol table [4]. An important goal of building the symbol table is for Python to be able to mark the scope of variables it encounters - which variables are local to functions, which are global, which are free (lexically bound) and so on.

When the symbol table code sees a variable is assigned in a function, it marks it as local. Note that it doesn't matter if the assignment was done before usage, after usage, or maybe not actually executed due to a condition in code like this:

x = 10

def foo():
    if something_false_at_runtime:
        x = 20
    print(x)

We can use the symtable module to examine the symbol table information gathered on some Python code during compilation:

import symtable

code = '''
x = 10
def foo():
    x += 1
    print(x)
'''

table = symtable.symtable(code, '<string>', 'exec')
foo_namespace = table.lookup('foo').get_namespace()
sym_x = foo_namespace.lookup('x')
print(sym_x.get_name())
print(sym_x.is_local())

This prints:

x
True

So we see that x was marked as local in foo. Marking variables as local turns out to be important for optimization in the bytecode, since the compiler can generate a special instruction for it that's very fast to execute. There's an excellent article here explaining this topic in depth; I'll just focus on the outcome.

The part of the compiler (in Python/compile.c) that handles variable name references queries the symbol table for the scope of each name. For our x, this lookup returns a bitfield with LOCAL in it. Having seen LOCAL, the compiler generates a LOAD_FAST. We can see this in the disassembly of foo:

 35           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               1 (1)
              6 INPLACE_ADD
              7 STORE_FAST               0 (x)

 36          10 LOAD_GLOBAL              0 (print)
             13 LOAD_FAST                0 (x)
             16 CALL_FUNCTION            1
             19 POP_TOP
             20 LOAD_CONST               0 (None)
             23 RETURN_VALUE
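
If you want to reproduce such a disassembly yourself, the standard-library dis module will print it (a minimal sketch; foo is the same function as in the earlier snippet, and disassembling it does not execute it, so no exception is raised):

import dis

x = 10

def foo():
    x += 1
    print(x)

dis.dis(foo)   # note LOAD_FAST for x appearing before its STORE_FAST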

The first block of instructions shows what x += 1 was compiled to. You will note that already here (before x is actually assigned), LOAD_FAST is used to retrieve the value of x.

This LOAD_FAST is the instruction that will cause the exception to be raised at runtime, because it is actually executed before any STORE_FAST is done for x. The gory details are in the bytecode interpreter code in Python/ceval.c:

TARGET(LOAD_FAST)
    x = GETLOCAL(oparg);
    if (x != NULL) {
        Py_INCREF(x);
        PUSH(x);
        FAST_DISPATCH();
    }
    format_exc_check_arg(PyExc_UnboundLocalError,
                         UNBOUNDLOCAL_ERROR_MSG,
                         PyTuple_GetItem(co->co_varnames, oparg));
    break;

Ignoring the macro-fu for the moment, what this basically says is that once LOAD_FAST is seen, the value of x is obtained from an indexed array of objects [5]. If no STORE_FAST was done before, this value is still NULL, the if branch is not taken [6] and the UnboundLocalError exception is raised.

You may wonder why Python waits until runtime to raise this exception, instead of detecting it in the compiler. The reason is this code:

x = 10

def foo():
    if something_true():
        x = 1
    x += 1
    print(x)

Suppose something_true is a function that returns True, possibly due to some user input. In this case, x = 1 binds x locally, so the reference to x in x += 1 is no longer unbound. This code will then run without exceptions. Of course, if something_true actually turns out to return False, the exception will be raised. Python has no way to resolve this at compile time, so the error detection is postponed to runtime.

