[Python Notes] Exploring CPython's Implementation of Int Objects Through an "Odd" Case

1. Python's object model

In Python, everything is an object. According to the Data Model chapter of the official Python documentation, every Python object has three characteristics: an identity, a type, and a value. That summary matters so much for understanding Python objects that it is quoted here (split into numbered paragraphs to make the structure clearer):

1) Every object has an identity, a type and a value.
2) An object’s identity never changes once it has been created; you may think of it as the object’s address in memory. The ‘is’ operator compares the identity of two objects; the id() function returns an integer representing its identity (currently implemented as its address).
3) An object’s type is also unchangeable. An object’s type determines the operations that the object supports (e.g., "does it have a length?") and also defines the possible values for objects of that type. The type() function returns an object’s type (which is an object itself).
4) The value of some objects can change. Objects whose value can change are said to be mutable; objects whose value is unchangeable once they are created are called immutable. (The value of an immutable container object that contains a reference to a mutable object can change when the latter’s value is changed; however the container is still considered immutable, because the collection of objects it contains cannot be changed. So, immutability is not strictly the same as having an unchangeable value, it is more subtle.)
5) An object’s mutability is determined by its type; for instance, numbers, strings and tuples are immutable, while dictionaries and lists are mutable.

To summarize:

1) Every Python object has three characteristics: an identity, a type, and a value.
2) Once an object has been created, its identity (think of it as the object's memory address) never changes. The built-in function id() returns an object's identity, and the is operator tests whether two references point to the same object.
3) The type of an existing object cannot be changed. The type determines which operations the object supports and which values it can take.
4) The value of some objects (such as lists and dicts) can be modified; these are called mutable objects. Other objects (such as numbers, strings, and tuples) cannot have their value modified after creation; these are called immutable objects.
5) Whether an object's value can be modified is determined by its type.
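The points above can be checked directly in an interpreter. The following is a minimal sketch rather than anything taken from the official documentation; the helper describe() and all variable names are made up for illustration.

```python
def describe(obj):
    """Return the (identity, type, value) triple of an object (illustrative helper)."""
    return id(obj), type(obj), obj


nums = [1, 2, 3]
alias = nums          # a second name for the same list object
copy = list(nums)     # a distinct list object with an equal value

same_object = alias is nums                                 # same identity
equal_but_distinct = (copy == nums) and (copy is not nums)  # equal value, different identity

# Mutating a list changes its value but not its identity.
id_before = id(nums)
nums.append(4)
list_identity_kept = id(nums) == id_before

# A tuple is immutable, yet its printed value changes when a mutable
# object it refers to is modified.
inner = [1]
container = (inner, "x")
inner.append(2)       # container now compares equal to ([1, 2], 'x')

# Replacing one of the tuple's items is rejected.
try:
    container[0] = []
    tuple_assignment_rejected = False
except TypeError:
    tuple_assignment_rejected = True
```

Here alias and nums share one identity, while copy only shares the value. The tuple case illustrates the parenthetical remark in point 4 of the quotation: the set of objects the tuple holds is fixed, even though one of those objects can change.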

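2. An example: "modifying" the value of an immutable object

From the description above, numeric objects are immutable. To make this concrete, consider the following session.

>>> x = 2.11
>>> id(x)
7223328
>>> x += 0.5
>>> x
2.61
>>> id(x)
7223376

Here x += 0.5 looks as if it modified the value of the object named x. In CPython, however, x is only a name that holds a reference (a pointer, at the C level) to an object; x itself is not a numeric object. What actually happens is:

1) A float object with the value 2.11 is created.
2) The name x is bound to that object, so x holds a reference that counts toward the object's reference count.
3) When "x += 0.5" runs, a new float object with the value 2.61 is created and x is rebound to it. The new object gains the reference held by x and the first object loses it; once the first object's reference count drops to zero, it is reclaimed.

So the code never modifies the value of the object that x originally referred to. The identifier x is simply rebound to a newly created object, which gives the impression that the value was "modified".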
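The rebinding can be made visible by keeping a second name on the original object. This is a small sketch; the name first is introduced only for the demonstration. Holding on to the original object also keeps it alive, so the comparison does not depend on whether a freed address gets reused.

```python
x = 2.11
first = x        # second reference: keeps the original float object alive

x += 0.5         # builds a new float object and rebinds the name x to it

rebound_to_new_object = x is not first   # x now refers to a different object
original_untouched = first == 2.11       # the first object still holds 2.11
```

If floats were mutable and += modified the object in place, first would have changed along with x. It does not, because x += 0.5 on a float is equivalent to x = x + 0.5.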

3. An "odd" case

Given the explanation above, how should we make sense of the following case?

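>>> a = 20
>>> b = 20
>>> id(a)
7151888
>>> id(b)
7151888
>>> a is b
True

By the reasoning so far, a and b should refer to different objects, and their ids should differ. Yet id(a) == id(b), and "a is b" evaluates to True, so the CPython interpreter is clearly not doing what we expected. Is this a bug in the interpreter?

4. Uncovering the answer in CPython's PyIntObject source

In fact, the unexpected behavior comes from an optimization in CPython's implementation of the PyIntObject type. Section 4.5.2 of the book Core Python Programming notes that integer objects are immutable, so Python caches them for efficiency, and this creates the illusion that Python failed to create a new object where we thought it should have. That is exactly the cause of the "odd" case above.

To confirm this, I looked at the CPython v2.7 source on GitHub (cpython/Objects/intobject.c), which contains the following code:

    #ifndef NSMALLPOSINTS
    #define NSMALLPOSINTS 257
    #endif
    #ifndef NSMALLNEGINTS
    #define NSMALLNEGINTS 5
    #endif
    #if NSMALLNEGINTS + NSMALLPOSINTS > 0
    /* References to small integers are saved in this array so that they
       can be shared.
       The integers that are saved are those in the range
       -NSMALLNEGINTS (inclusive) to NSMALLPOSINTS (not inclusive).
    */
    static PyIntObject *small_ints[NSMALLNEGINTS + NSMALLPOSINTS];
    #endif

So the interpreter does allocate a small_ints array to cache small integers. From the macro definitions and the comment, the cached range is [-5, 257), that is, -5 through 256 inclusive.

Searching the same file for "small_ints" shows that the array is used by four functions: _PyInt_Init, PyInt_FromLong, PyInt_ClearFreeList, and PyInt_Fini. The last two deal with releasing resources and are not relevant here. _PyInt_Init constructs the small int objects and stores them in the small_ints array. In PyObject *PyInt_FromLong(long ival), if the requested value is a small int (that is, ival falls within the small-int range), the function returns the cached object from small_ints directly; if ival is outside that range, it creates a new object and returns a reference to it.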
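The caching scheme described above can be modeled in a few lines of Python. This is only a sketch of the idea, not a transcription of the C code: IntBox and int_from_long are hypothetical stand-ins for PyIntObject and PyInt_FromLong, and the real function also manages reference counts and a free list that are omitted here.

```python
NSMALLPOSINTS = 257   # same values as the macros in intobject.c
NSMALLNEGINTS = 5


class IntBox:
    """Hypothetical stand-in for PyIntObject: wraps one integer value."""
    __slots__ = ("ival",)

    def __init__(self, ival):
        self.ival = ival


# Plays the role of _PyInt_Init: build one shared object per small value.
small_ints = [IntBox(v) for v in range(-NSMALLNEGINTS, NSMALLPOSINTS)]


def int_from_long(ival):
    """Plays the role of PyInt_FromLong: reuse the cached object if possible."""
    if -NSMALLNEGINTS <= ival < NSMALLPOSINTS:
        return small_ints[ival + NSMALLNEGINTS]   # shared, cached object
    return IntBox(ival)                           # fresh object on every call


a = int_from_long(20)
b = int_from_long(20)      # same cached object as a
c = int_from_long(1000)
d = int_from_long(1000)    # equal value, but a separate object from c
```

In this model a is b holds while c is d does not, which mirrors the session in section 3. Because the cache is an implementation detail of CPython, integer values are best compared with == rather than is.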
