Re: Factor

Factor: the language, the theory, and the practice.

Split Lines

Sunday, October 12, 2025

William Woodruff recently noticed that Python’s splitlines splits on a lot more than just newlines:

I always assumed that Python’s str.splitlines() split strings by “universal newlines”, i.e., \n, \r, and \r\n.

But it turns out it does a lot more than that.

The recent Factor 0.100 release included a change that makes the split-lines word split on Unicode line breaks, matching the Python behavior.

IN: scratchpad "line1\nline2\rline3\r\nline4\vline5\x1dhello"
               split-lines .
{ "line1" "line2" "line3" "line4" "line5" "hello" }
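
For comparison, here is a minimal Python sketch that runs the same input through str.splitlines() and through a split that only knows the three "universal newlines". The split_universal helper is a hypothetical name introduced for illustration; it is not part of Python or Factor.

```python
import re

# Same input as the Factor example above.
text = "line1\nline2\rline3\r\nline4\vline5\x1dhello"


def split_universal(s):
    # Hypothetical helper: split only on CR LF, CR, and LF.
    # CR LF is listed first so the pair is consumed as one break.
    return re.split(r"\r\n|\r|\n", s)


unicode_lines = text.splitlines()
universal_lines = split_universal(text)

print(unicode_lines)
# ['line1', 'line2', 'line3', 'line4', 'line5', 'hello']
print(universal_lines)
# ['line1', 'line2', 'line3', 'line4\x0bline5\x1dhello']
```

The last element of the second result still contains the Line Tabulation and Group Separator characters, which is the behavior the quote above assumed splitlines would have.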

These are considered line breaks:

Character    Description
\n           Line Feed
\r           Carriage Return
\r\n         Carriage Return + Line Feed
\v           Line Tabulation
\f           Form Feed
\x1c         File Separator
\x1d         Group Separator
\x1e         Record Separator
\x85         Next Line (C1 Control Code)
\u002028     Line Separator
\u002029     Paragraph Separator
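
The table can be checked against the Python side with a short sketch like the one below. It assumes the last two rows refer to U+2028 and U+2029, written with Python's four-digit \u escape. LINE_BREAKS and splits_on are illustrative names, not library APIs.

```python
# Each separator from the table above, keyed to its description.
LINE_BREAKS = {
    "\n": "Line Feed",
    "\r": "Carriage Return",
    "\r\n": "Carriage Return + Line Feed",
    "\v": "Line Tabulation",
    "\f": "Form Feed",
    "\x1c": "File Separator",
    "\x1d": "Group Separator",
    "\x1e": "Record Separator",
    "\x85": "Next Line (C1 Control Code)",
    "\u2028": "Line Separator",
    "\u2029": "Paragraph Separator",
}


def splits_on(sep):
    # True if str.splitlines() treats sep as exactly one line break.
    return f"a{sep}b".splitlines() == ["a", "b"]


for sep, name in LINE_BREAKS.items():
    print(f"{sep!r:>10}  {name:<30} {splits_on(sep)}")

# A tab is not a line break, so it should not split.
print(splits_on("\t"))
```

Every row should report True and the final tab check False. Note that this applies to str only: bytes.splitlines() splits on just the universal newlines.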

This might be surprising – or just what you needed!