I wondered how str.lower() is implemented in Python, so I cloned the cpython repository and searched it with grep. After a few jumps starting from unicode_lower in Objects/unicodeobject.c, I came across this in Objects/unicodetype.c:
    int _PyUnicode_ToLowerFull(Py_UCS4 ch, Py_UCS4 *res)
    {
        const _PyUnicode_TypeRecord *ctype = gettyperecord(ch);
        if (ctype->flags & EXTENDED_CASE_MASK) {
            int index = ctype->lower & 0xFFFF;
            int n = ctype->lower >> 24;
            int i;
            for (i = 0; i < n; i++)
                res[i] = _PyUnicode_ExtendedCase[index + i];
            return n;
        }
        res[0] = ch + ctype->lower;
        return 1;
    }
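For reference, here is what I can observe from the Python side. This is only a small sketch of the visible behaviour. My guess (an assumption, not something I have confirmed in the source) is that "A" goes through the final res[0] = ch + ctype->lower line, while U+0130 goes through the EXTENDED_CASE_MASK branch, since its lowercase form is more than one code point.

```python
# Simple case: one code point in, one code point out.
simple = "A".lower()
delta = ord(simple) - ord("A")  # distance between "A" and "a"

# U+0130 (LATIN CAPITAL LETTER I WITH DOT ABOVE) lowercases to two code points.
dotted_i = "\u0130"
expanded = dotted_i.lower()
lengths = (len(dotted_i), len(expanded))
code_points = [hex(ord(c)) for c in expanded]

print(simple, delta)   # a 32
print(lengths)         # (1, 2)
print(code_points)     # ['0x69', '0x307']
```

So the function apparently has to be able to write more than one code point into res and report how many it wrote, which would explain the int return value.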
I am familiar with C, but not with how Python is implemented (though I want to change that!). I don't really understand what is going on in this function, so I'm looking for a clear explanation.