Lua 笔记 string/table/function知识点

本文涉及到一些关于string、table和function的细碎知识点，一些常用操作的背后逻辑。本文是用于记录一次技术分享，部分内容与之前的一篇Lua的笔记有重叠。

本文参考和使用的lua源码基于Lua 5.3.5，编写的lua脚本运行于OSX系统，使用的是64位的lua运行库。

评估方法

纯lua侧的对于代码执行耗时的评估和执行过程中产生的堆内存的分析。

时间分析

对应于cpu负载，借助os.clock()函数，以下是一个示例：

local a, b
a = os.clock()
-- do something
b = os.clock()
print(b-a)

空间分析

对应于内存占用，借助collectgarbage()函数，这个函数传入不同的参数可以对lua的gc机制进行不同的操作控制，具体不再展开，这里主要是对堆内存的占用进行评估，用到的是以下的一套组合三连，即先强制一轮完整的gc（collect），然后禁用gc（stop），在执行完一些待测试的代码之后获取对内存占用的千字节数（count）：

collectgarbage("collect")
collectgarbage("stop")
-- do something
print(collectgarbage("count"))

来一个简单的gc演示，三次获取堆内存，分配前、分配后、gc后：

collectgarbage("collect")
collectgarbage("stop")
local a = collectgarbage("count") * 1024

local t = {}
t = nil

local b = collectgarbage("count") * 1024

collectgarbage("collect")

local c = collectgarbage("count") * 1024

print("before alloc " .. a)
print("after alloc " .. b)
print("after collect " .. c)

上边输出的内容，a和c是相等的，b会多出来56个字节（这是一个空表的内存占用）。

通用的lua类型

值类型

Lua中通用的值类型，TValue，使用一个联合体保存数据，和一个枚举值区分该值的类型，定义在lobject.h第100行：

/*
** Union of all Lua values
*/
typedef union Value {
  GCObject *gc;    /* collectable objects */
  void *p;         /* light userdata */
  int b;           /* booleans */
  lua_CFunction f; /* light C functions */
  lua_Integer i;   /* integer numbers */
  lua_Number n;    /* float numbers */
} Value;


#define TValuefields	Value value_; int tt_


typedef struct lua_TValue {
  TValuefields;
} TValue;

GC信息

gc相关的通用数据，table、string和function都是受gc管理的类型，它们的结构中都是以一个CommonHeader开始，其定义如下：

/*
** Common type for all collectable objects
*/
typedef struct GCObject GCObject;


/*
** Common Header for all collectable objects (in macro form, to be
** included in other objects)
*/
#define CommonHeader  GCObject *next; lu_byte tt; lu_byte marked


/*
** Common type has only the common header
*/
struct GCObject {
  CommonHeader;
};

string

先从一段简单短Lua代码开始，尝试得到一个空字符串""所占用的堆内存：

collectgarbage("collect")
collectgarbage("stop")

-- -------------------------------------------------------

local before = collectgarbage("count")

local s = table.concat({})
collectgarbage("collect")

-- -------------------------------------------------------

local after = collectgarbage("count")
print(1024 * (after - before))

输出25，即一个空字符串会产生25字节的堆内存。此处之所以使用table.concat({})而不是直接使用""，是因为代码中直接存在的字面字符串不会再单独分配堆内存。

来看一看一个空字符串的25个字节分别是什么内容，并以此为基础计算任意一个字符串占用的内存。

结构和定义

字符串的类型的定义在lobject.h第303行。

/*
** Header for string value; string bytes follow the end of this structure
** (aligned according to 'UTString'; see next).
*/
typedef struct TString {
  CommonHeader;
  lu_byte extra;  /* reserved words for short strings; "has hash" for longs */
  lu_byte shrlen;  /* length for short strings */
  unsigned int hash;
  union {
    size_t lnglen;  /* length for long strings */
    struct TString *hnext;  /* linked list for hash table */
  } u;
} TString;

为确保内存对齐使用结构体UTString又套了一层：

/*
** Ensures that address after this type is always fully aligned.
*/
typedef union UTString {
  L_Umaxalign dummy;  /* ensures maximum alignment for strings */
  TString tsv;
} UTString;

在lstring.c中有创建字符串的函数：

/*
** creates a new string object
*/
static TString *createstrobj (lua_State *L, size_t l, int tag, unsigned int h) {
  TString *ts;
  GCObject *o;
  size_t totalsize;  /* total size of TString object */
  totalsize = sizelstring(l);
  o = luaC_newobj(L, tag, totalsize);
  ts = gco2ts(o);
  ts->hash = h;
  ts->extra = 0;
  getstr(ts)[l] = '\0';  /* ending 0 */
  return ts;
}

其中包含一个宏sizelstring，定义于lstring.h，拆开是：

1	#define sizelstring(l) (sizeof(union UTString) + ((l) + 1) * sizeof(char))

也就是说，要创建一个长度为l的字符串，实际上会分配sizelstring(l)的内存。结合前边UTString和CommonHeader的结构定义，不难得出UTString占用24字节。所以对于一个空字符串，加上1个char，一共是25字节。多出来的一个char是为了分配\0。

可以得到如下结论：Lua中的string由两部分组成，字符串头和字符数组。字符串头即是前边的UString，字符数组就是传统的c中的数组，最终需要有\0作为结尾。

lobject.h中还定义了获取string的字符数组部分的宏：

/*
** Get the actual string (array of bytes) from a 'TString'.
** (Access to 'extra' ensures that value is really a 'TString'.)
*/
#define getstr(ts)  \
  check_exp(sizeof((ts)->extra), cast(char *, (ts)) + sizeof(UTString))


/* get the actual string (array of bytes) from a Lua value */
#define svalue(o)       getstr(tsvalue(o))

短字符串和长字符串

执行下边的代码：

collectgarbage("collect")
collectgarbage("stop")

-- -------------------------------------------------------
local before = collectgarbage("count")

local s1 = "1234567890123456789012345678901234567890" -- 长度 40
-- local s1 = "123456789012345678901234567890123456789" -- 长度 39

for i = 1, 9 do
    local s = s1 .. " "
end

-- -------------------------------------------------------

local after = collectgarbage("count")
print(1024 * (after - before))

发现当s1长度为39或者40时，产生的堆内存会有巨大的差异（适用于Lua 5.3.5 未必适用于更早版本的lua）。

原因是在Lua中有长短字符串之分，字符串有两种子类型：

lobject.h 52

1
2
3

/* Variant tags for strings */
#define LUA_TSHRSTR (LUA_TSTRING | (0 << 4))  /* short strings */
#define LUA_TLNGSTR (LUA_TSTRING | (1 << 4))  /* long strings */

短字符串的最大长度为40，定义于llimits.h：

/*
** Maximum length for short strings, that is, strings that are
** internalized. (Cannot be smaller than reserved words or tags for
** metamethods, as these strings must be internalized;
** #("function") = 8, #("__newindex") = 10.)
*/
#if !defined(LUAI_MAXSHORTLEN)
#define LUAI_MAXSHORTLEN	40
#endif

在上边的示例代码中，分别在循环内部创建相同的字符串。两种情况下创建的分别是短字符串和长字符串。二者的本质区别如下：

短字符串保存在全局的字符串表stringtable中，产生时会立即计算hash；
长字符串单独保存，只有在需要时才lazy计算hash；

以下是创建字符串的逻辑，在lstring.c中：

/*
** new string (with explicit length)
*/
TString *luaS_newlstr (lua_State *L, const char *str, size_t l) {
  if (l <= LUAI_MAXSHORTLEN)  /* short string? */
    return internshrstr(L, str, l);
  else {
    TString *ts;
    if (l >= (MAX_SIZE - sizeof(TString))/sizeof(char))
      luaM_toobig(L);
    ts = luaS_createlngstrobj(L, l);
    memcpy(getstr(ts), str, l * sizeof(char));
    return ts;
  }
}

在创建时会判断是否短字符串，如果是短字符串，那么走internshrstr，否则luaS_createlngstrobj。先来看短字符串的内部化：

/*
** checks whether short string exists and reuses it or creates a new one
*/
static TString *internshrstr (lua_State *L, const char *str, size_t l) {
  TString *ts;
  global_State *g = G(L);
  unsigned int h = luaS_hash(str, l, g->seed);
  TString **list = &g->strt.hash[lmod(h, g->strt.size)];
  lua_assert(str != NULL);  /* otherwise 'memcmp'/'memcpy' are undefined */
  for (ts = *list; ts != NULL; ts = ts->u.hnext) {
    if (l == ts->shrlen &&
        (memcmp(str, getstr(ts), l * sizeof(char)) == 0)) {
      /* found! */
      if (isdead(g, ts))  /* dead (but not collected yet)? */
        changewhite(ts);  /* resurrect it */
      return ts;
    }
  }
  if (g->strt.nuse >= g->strt.size && g->strt.size <= MAX_INT/2) {
    luaS_resize(L, g->strt.size * 2);
    list = &g->strt.hash[lmod(h, g->strt.size)];  /* recompute with new size */
  }
  ts = createstrobj(L, l, LUA_TSHRSTR, h);
  memcpy(getstr(ts), str, l * sizeof(char));
  ts->shrlen = cast_byte(l);
  ts->u.hnext = *list;
  *list = ts;
  g->strt.nuse++;
  return ts;
}

计算hash并到字符串表&g->strt中查找，如果找到了则返回已有的引用，否则createstrobj创建新字符串。

而对于长字符串，直接走创建新字符串的流程：

TString *luaS_createlngstrobj (lua_State *L, size_t l) {
  TString *ts = createstrobj(L, l, LUA_TLNGSTR, G(L)->seed);
  ts->u.lnglen = l;
  return ts;
}

字符串的拼接

对于字符串..拼接和使用table.concat两种方式，都是通用的底层原理，重点在于减少过程中产生的临时字符串，以避免额外的gc开销，详见这里。

使用string.format避免不必要的gc

使用string.format，避免数字产生的短字符串。

collectgarbage("collect")
collectgarbage("stop")

-- -------------------------------------------------------
local before = collectgarbage("count")

local title1 = "attack: "
local title2 = "defence: "

for i = 1, 100 do
    -- local s = string.format("%s%d%s%d",title1,i,title2,i)
    local s = title1 .. i .. title2 .. i
end

-- -------------------------------------------------------

local after = collectgarbage("count")
print(1024 * (after - before))

拼接字符串时，会自动调用tostring，将数字转为字符串，而使用format则可以避免创建这些短字符串。

table

先来一段Lua代码，尝试获取一个空表的堆内存占用：

collectgarbage("collect")
collectgarbage("stop")
local before = collectgarbage("count")
-- -------------------------------------------------------

local a = {}

-- -------------------------------------------------------
local after = collectgarbage("count")
print(1024 * (after - before))

得到的结果是56个字节，一个空表的内存占用。

结构和定义

表的类型定于与lobject.h第497行：

typedef struct Table {
  CommonHeader;
  lu_byte flags;  /* 1<<p means tagmethod(p) is not present */
  lu_byte lsizenode;  /* log2 of size of 'node' array */
  unsigned int sizearray;  /* size of 'array' array */
  TValue *array;  /* array part */
  Node *node;
  Node *lastfree;  /* any free position is before this position */
  struct Table *metatable;
  GCObject *gclist;
} Table;

对于一个空的表，数组部分和字典部分都为空，易知其结构中各成员直接加和得到的总的内存占用。

对于不为空的表，数组部分的数据存储于TValue *array，每个元素作为一个TValue存储；字典部分的数据存储于Node *node。Node的定义如下：

typedef struct Node {
  TValue i_val;
  TKey i_key;
} Node;

typedef union TKey {
  struct {
    TValuefields;
    int next;  /* for chaining (offset for next node) */
  } nk;
  TValue tvk;
} TKey;

对于一个表来说，数组部分的数据，每个数据在表中占用16字节，字典部分的数据，每个数据在表中占用32字节。

获取表的长度

使用#取表的长度，会调用luaH_getn，其定义和涉及到的函数如下，在ltable.c中：

static lua_Unsigned unbound_search (Table *t, lua_Unsigned j) {
  lua_Unsigned i = j;  /* i is zero or a present index */
  j++;
  /* find 'i' and 'j' such that i is present and j is not */
  while (!ttisnil(luaH_getint(t, j))) {
    i = j;
    if (j > l_castS2U(LUA_MAXINTEGER) / 2) {  /* overflow? */
      /* table was built with bad purposes: resort to linear search */
      i = 1;
      while (!ttisnil(luaH_getint(t, i))) i++;
      return i - 1;
    }
    j *= 2;
  }
  /* now do a binary search between them */
  while (j - i > 1) {
    lua_Unsigned m = (i+j)/2;
    if (ttisnil(luaH_getint(t, m))) j = m;
    else i = m;
  }
  return i;
}


/*
** Try to find a boundary in table 't'. A 'boundary' is an integer index
** such that t[i] is non-nil and t[i+1] is nil (and 0 if t[1] is nil).
*/
lua_Unsigned luaH_getn (Table *t) {
  unsigned int j = t->sizearray;
  if (j > 0 && ttisnil(&t->array[j - 1])) {
    /* there is a boundary in the array part: (binary) search for it */
    unsigned int i = 0;
    while (j - i > 1) {
      unsigned int m = (i+j)/2;
      if (ttisnil(&t->array[m - 1])) j = m;
      else i = m;
    }
    return i;
  }
  /* else must find a boundary in hash part */
  else if (isdummy(t))  /* hash part is empty? */
    return j;  /* that is easy... */
  else return unbound_search(t, j);
}

获取表的长度，涉及到数组部分和哈希部分。如果数组部分未满，在数组内二分查找，数组为空或数组满则在哈希部分查找。

在使用#获取数组长度，当数组部分有空洞（nil）时，可能出现与预期不一致的结果：

数组部分满时：

local a = {}

for i = 1,8 do
    table.insert(a, i)
end

-- 数组部分 8/8

for i = 1,7 do
    a[i] = nil
end

print(#a) -- 输出8

数组部分不满时，二分查找可能出现的于预期不符的情况：

local a = {}

for i = 1,7 do
    table.insert(a,i)
end

-- 数组部分 7/8

a[1] = nil
print(#a) -- 输出7

a[6] = nil
print(#a) -- 输出5

插入数据

详见这里，插入数据过程中应避免表长度的获取。

rehash

数组扩容

向数组中插入新的元素时，数组部分会动态扩容，以下是一段动态扩容的演示代码：

collectgarbage("collect")
collectgarbage("stop")

-- -------------------------------------------------------

local a = {}

for i = 1,20 do

    local before = collectgarbage("count")
    table.insert(a,i)
    local after = collectgarbage("count")
    print(i, 1024 * (after - before))

end

将输出：

在每次插入数据时，如果数组部分已满，就将数组部分重新分配，长度是原来的2倍。

字典部分rehash

演示字典部分扩容的代码，字典填满后会扩容并rehash：

collectgarbage("collect")
collectgarbage("stop")

-- -------------------------------------------------------

local a = {}

for i = 1,20 do

    local key = tostring(i)
    local before = collectgarbage("count")
    a[key] = i
    local after = collectgarbage("count")
    print(i, 1024 * (after - before))

end

将输出：

字典部分扩容的规则与数组相似。由于字典保存的是Node，而数组保存的是TValue，因此字典部分单位元素的存储空间也会比数组大。

收缩

前边提到了数组部分的扩张和字典的rehash扩张，某些情况下，数组和字典部分分配的空间会收缩：

当数组中的元素置为nil时，数组不会立即收缩，而是在字典部分触发rehash时，才重新计算数组部分的长度，下边的代码演示rehash引起的数组部分的收缩：

collectgarbage("collect")
collectgarbage("stop")

-- -------------------------------------------------------

local a = {}

for i = 1, 8 do
	table.insert(a,i)
end

-- 此时 数组部分8/8 字典部分0/0

for i = 2, 7 do
	a[i] = nil
end

-- 此时 数组只有1和8有值 2-7都是空值 
-- 数组2/8 哈希0/0
local before = collectgarbage("count")
a[9] = 9
-- 数组部分 1/1   字典部分 2/2
local after = collectgarbage("count")

print((after - before) * 1024)

local arrayElement = 16
local hashElement = 32

print(1 * arrayElement + 2 * hashElement - 8 * arrayElement)

在触发字典rehash的时候，将字典部分的数据并入数组：

collectgarbage("collect")
collectgarbage("stop")

-- -------------------------------------------------------

local a = {}

for i = 5, 8 do
	a[i] = i
end

-- 此时 数组部分0/0 字典部分4/4

local before = collectgarbage("count")

a[1] = 1

local after1 = collectgarbage("count")

a[2] = 2
a[3] = 3
a[4] = 4

-- 数组部分 8/8   哈希部分 0/0
local after2 = collectgarbage("count")

print((after1 - before) * 1024)
print((after2 - before) * 1024)

当触发rehash的时候，首先计算新的array部分和hash部分的尺寸。计算出可使数组部分填充率大于1/2的数组部分的长度，其余的key存入字典部分：

/*
** nums[i] = number of keys 'k' where 2^(i - 1) < k <= 2^i
*/
static void rehash (lua_State *L, Table *t, const TValue *ek) {
  unsigned int asize;  /* optimal size for array part */
  unsigned int na;  /* number of keys in the array part */
  unsigned int nums[MAXABITS + 1];
  int i;
  int totaluse;
  for (i = 0; i <= MAXABITS; i++) nums[i] = 0;  /* reset counts */
  na = numusearray(t, nums);  /* count keys in array part */
  totaluse = na;  /* all those keys are integer keys */
  totaluse += numusehash(t, nums, &na);  /* count keys in hash part */
  /* count extra key */
  na += countint(ek, nums);
  totaluse++;
  /* compute new size for array part */
  asize = computesizes(nums, &na);
  /* resize the table to new computed sizes */
  luaH_resize(L, t, asize, totaluse - na);
}

/*
** Count keys in array part of table 't': Fill 'nums[i]' with
** number of keys that will go into corresponding slice and return
** total number of non-nil keys.
*/
static unsigned int numusearray (const Table *t, unsigned int *nums) {
  int lg;
  unsigned int ttlg;  /* 2^lg */
  unsigned int ause = 0;  /* summation of 'nums' */
  unsigned int i = 1;  /* count to traverse all array keys */
  /* traverse each slice */
  for (lg = 0, ttlg = 1; lg <= MAXABITS; lg++, ttlg *= 2) {
    unsigned int lc = 0;  /* counter */
    unsigned int lim = ttlg;
    if (lim > t->sizearray) {
      lim = t->sizearray;  /* adjust upper limit */
      if (i > lim)
        break;  /* no more elements to count */
    }
    /* count elements in range (2^(lg - 1), 2^lg] */
    for (; i <= lim; i++) {
      if (!ttisnil(&t->array[i-1]))
        lc++;
    }
    nums[lg] += lc;
    ause += lc;
  }
  return ause;
}


static int numusehash (const Table *t, unsigned int *nums, unsigned int *pna) {
  int totaluse = 0;  /* total number of elements */
  int ause = 0;  /* elements added to 'nums' (can go to array part) */
  int i = sizenode(t);
  while (i--) {
    Node *n = &t->node[i];
    if (!ttisnil(gval(n))) {
      ause += countint(gkey(n), nums);
      totaluse++;
    }
  }
  *pna += ause;
  return totaluse;
}

构造一个表

如果已知表中需要有哪些key，可以预先分配，详见这里，可以有效减少rehash的次数。

除此之外，预先指定key还能避免额外的内存分配：

collectgarbage("collect")
collectgarbage("stop")

-- -------------------------------------------------------

local before1 = collectgarbage("count")

local a = {}

a[1] = 1
a[2] = 2
a[3] = 3

local after1 = collectgarbage("count")
print((after1 - before1) * 1024)

-- 56 + 16 * 4

-- -------------------------------------------------------

local before2 = collectgarbage("count")

local b = {1,2,3}

local after2 = collectgarbage("count")
print((after2 - before2) * 1024)

-- 56 + 16 * 3

遍历

在lua中使用ipairs和pairs对表进行遍历时，内部实际上调用的是next函数来获取下一个key。优先遍历数组部分，接下来遍历字典部分。

int luaH_next (lua_State *L, Table *t, StkId key) {
  unsigned int i = findindex(L, t, key);  /* find original element */
  for (; i < t->sizearray; i++) {  /* try first array part */
    if (!ttisnil(&t->array[i])) {  /* a non-nil value? */
      setivalue(key, i + 1);
      setobj2s(L, key+1, &t->array[i]);
      return 1;
    }
  }
  for (i -= t->sizearray; cast_int(i) < sizenode(t); i++) {  /* hash part */
    if (!ttisnil(gval(gnode(t, i)))) {  /* a non-nil value? */
      setobj2s(L, key, gkey(gnode(t, i)));
      setobj2s(L, key+1, gval(gnode(t, i)));
      return 1;
    }
  }
  return 0;  /* no more elements */
}

赋值：数组还是哈希

Q&A环节的一个问题，当我们向table中使用t[i] = v这样的操作插入一对键值且键i是整数的时候，这组值会保存在数组部分还是字典部分呢？

void luaH_setint (lua_State *L, Table *t, lua_Integer key, TValue *value) {
  const TValue *p = luaH_getint(t, key);
  TValue *cell;
  if (p != luaO_nilobject)
    cell = cast(TValue *, p);
  else {
    TValue k;
    setivalue(&k, key);
    cell = luaH_newkey(L, t, &k);
  }
  setobj2t(L, cell, value);
}

当插入一个整数key时，先使用luaH_getint去获取TValue，如果找到则直接修改其值，否则走luaH_newkey在字典段创建新的键值对。

/*
** search function for integers
*/
const TValue *luaH_getint (Table *t, lua_Integer key) {
  /* (1 <= key && key <= t->sizearray) */
  if (l_castS2U(key) - 1 < t->sizearray)
    return &t->array[key - 1];
  else {
    Node *n = hashint(t, key);
    for (;;) {  /* check whether 'key' is somewhere in the chain */
      if (ttisinteger(gkey(n)) && ivalue(gkey(n)) == key)
        return gval(n);  /* that's it */
      else {
        int nx = gnext(n);
        if (nx == 0) break;
        n += nx;
      }
    }
    return luaO_nilobject;
  }
}

在使用luaH_getint获取值时，先判断是否在数组段，如果是则直接返回对应位置的TValue，否则在字典部分找，找到返回，找不到则返回nil。

function

结构和定义

我们平时所说的函数，在lua中统一认为是闭包。闭包由函数原型和闭包变量组成。函数原型Proto和闭包Closures定义于lobject.h第407行和460行：

/*
** Function Prototypes
*/
typedef struct Proto {
  CommonHeader;
  lu_byte numparams;  /* number of fixed parameters */
  lu_byte is_vararg;
  lu_byte maxstacksize;  /* number of registers needed by this function */
  int sizeupvalues;  /* size of 'upvalues' */
  int sizek;  /* size of 'k' */
  int sizecode;
  int sizelineinfo;
  int sizep;  /* size of 'p' */
  int sizelocvars;
  int linedefined;  /* debug information  */
  int lastlinedefined;  /* debug information  */
  TValue *k;  /* constants used by the function */
  Instruction *code;  /* opcodes */
  struct Proto **p;  /* functions defined inside the function */
  int *lineinfo;  /* map from opcodes to source lines (debug information) */
  LocVar *locvars;  /* information about local variables (debug information) */
  Upvaldesc *upvalues;  /* upvalue information */
  struct LClosure *cache;  /* last-created closure with this prototype */
  TString  *source;  /* used for debug information */
  GCObject *gclist;
} Proto;



/*
** Lua Upvalues
*/
typedef struct UpVal UpVal;


/*
** Closures
*/

#define ClosureHeader \
	CommonHeader; lu_byte nupvalues; GCObject *gclist

typedef struct CClosure {
  ClosureHeader;
  lua_CFunction f;
  TValue upvalue[1];  /* list of upvalues */
} CClosure;


typedef struct LClosure {
  ClosureHeader;
  struct Proto *p;
  UpVal *upvals[1];  /* list of upvalues */
} LClosure;


typedef union Closure {
  CClosure c;
  LClosure l;
} Closure;


#define isLfunction(o)	ttisLclosure(o)

#define getproto(o)	(clLvalue(o)->p)

函数原型中保存函数相关的参数、指令、变量、内嵌函数、调试信息等。单个函数占用的堆内存很大程度上依赖于指令数量、参数和变量数量等。

不同的闭包可以共享同一份函数原型。当通过函数原型和upvalue得到闭包时，会比较新的闭包所持有的upvalue是否和原型中缓存的闭包的upvalue完全一致，如果完全一致，则直接使用同一个闭包。

闭包变量

闭包变量UpVal定义于lfunc.h第35行：

/*
** Upvalues for Lua closures
*/
struct UpVal {
  TValue *v;  /* points to stack or to its own value */
  lu_mem refcount;  /* reference counter */
  union {
    struct {  /* (when open) */
      UpVal *next;  /* linked list */
      int touched;  /* mark to avoid cycles with dead threads */
    } open;
    TValue value;  /* the value (when closed) */
  } u;
};

多个闭包可以共享同一个upvalue。在外层函数返回之前，内层的闭包访问upvalue时更像是访问栈上的变量，v指向栈上的变量，所有的开放状态的upvalue以链表的形式存储于全局的虚拟机中（openupval）。当外层函数返回，调用栈收缩后，v会指向内部的value。

Upvalue的内存生命周期使用引用计数来管理。

参数类型

在函数内部使用的变量，有几种类型：作用域内的变量（调用栈内）、upvalue、全局变量。

在lparser.c中获取变量的逻辑（第270行）：

/*
  Find variable with given name 'n'. If it is an upvalue, add this
  upvalue into all intermediate functions.
*/
static void singlevaraux (FuncState *fs, TString *n, expdesc *var, int base) {
  if (fs == NULL)  /* no more levels? */
    init_exp(var, VVOID, 0);  /* default is global */
  else {
    int v = searchvar(fs, n);  /* look up locals at current level */
    if (v >= 0) {  /* found? */
      init_exp(var, VLOCAL, v);  /* variable is local */
      if (!base)
        markupval(fs, v);  /* local will be used as an upval */
    }
    else {  /* not found as local at current level; try upvalues */
      int idx = searchupvalue(fs, n);  /* try existing upvalues */
      if (idx < 0) {  /* not found? */
        singlevaraux(fs->prev, n, var, 0);  /* try upper levels */
        if (var->k == VVOID)  /* not found? */
          return;  /* it is a global */
        /* else was LOCAL or UPVAL */
        idx  = newupvalue(fs, n, var);  /* will be a new upvalue */
      }
      init_exp(var, VUPVAL, idx);  /* new or old upvalue */
    }
  }
}

优先访问局部变量和参数，其次逐层查找upvalue，最后查找全局变量。

尾调用

详见之前的文章，使用尾调用提高函数执行效率，避免溢出。

REFERENCE

https://www.lua.org/source/5.3/