3.1.2. String

3.1.2.1. Basic Definition

Strings in Dao represent a sequence of bytes of known length. Unlike C strings, Dao strings are not null-terminated; the end of a string is determined by its stored length. Dao strings are therefore suitable for storing both text and arbitrary binary byte sequences. When a string is processed as text, however, Dao assumes UTF-8 encoding.

String literals in Dao are written with either single or double quotes: "string data" and 'string data' are equivalent. The only difference between the two forms is which quote character must be escaped: within double quotes, you must escape double quotes but not single quotes, while within single quotes, you must escape single quotes but not double quotes.

var str1 = "She said, \"I'm going out\""
var str2 = 'She said, "I\'m going out"'

Both strings contain the same contents, but had to be escaped differently.

In addition to typical ASCII and UTF-8 characters, the following escape sequences are accepted:

  • \d - Single decimal ASCII character (e.g., "\0")
  • \ooo - Up to three-digit octal value (e.g., "\141" == "a")
  • \xhh - Up to two-digit hexadecimal ASCII value (e.g., "\x61" == "a")
  • \uhhhh - Four-digit hexadecimal UTF-8 value (prefixed by lowercase u)
  • \Uhhhhhhhh - Eight-digit hexadecimal UTF-8 value (prefixed by capital U)
  • \n - Newline
  • \r - Carriage return
  • \t - Tab character

A backslash followed by any other character represents that character itself (e.g., "\\" represents a backslash, and "\f" simply becomes "f").
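The escape rules listed above can be modeled outside of Dao. The following is a minimal Python sketch, not Dao's actual lexer. The name decode_escapes is hypothetical, the single-digit form is folded into the octal rule, and the u and U forms are assumed to denote Unicode code points.

```python
import re

# Named escapes from the list above.
SIMPLE = {"n": "\n", "r": "\r", "t": "\t"}

# A backslash followed by the first form that matches, tried in this order.
ESCAPE = re.compile(
    r"\\(?:x([0-9A-Fa-f]{1,2})"   # x + up to two hex digits
    r"|u([0-9A-Fa-f]{4})"         # u + exactly four hex digits
    r"|U([0-9A-Fa-f]{8})"         # U + exactly eight hex digits
    r"|([0-7]{1,3})"              # up to three octal digits
    r"|(.))",                     # any other character: kept as itself
    re.DOTALL,
)


def decode_escapes(raw: str) -> str:
    """Expand backslash escapes in raw (assumed model of the rules above)."""

    def expand(match: re.Match) -> str:
        hex2, hex4, hex8, octal, other = match.groups()
        for digits, base in ((hex2, 16), (hex4, 16), (hex8, 16), (octal, 8)):
            if digits is not None:
                return chr(int(digits, base))
        # Named escape if known, otherwise the character itself.
        return SIMPLE.get(other, other)

    return ESCAPE.sub(expand, raw)


print(decode_escapes(r"\x61\141"))  # both forms decode to "a"
```

A backslash at the very end of the input has nothing to escape and is left untouched by this sketch.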

3.1.2.2. Verbatim Strings

Similar to “raw strings” in other languages, verbatim strings allow text to be written directly, with no escaping necessary. Newlines and special characters are all kept as they appear.

A verbatim string in Dao is both started and ended by the sequence @[]. In the event you need to include that sequence itself in your text, you can optionally add a delimiter to that sequence, as @[delim], which will be required on both ends.

@[]
        This is a verbatim string.
        It will be indented and contain newlines.
        "And I don't have to escape anything!"
@[]

@[delim]
        This verbatim string contains ", ', and @[] in it, but it will be fine.
@[delim]

3.1.2.3. String Methods

class string
string(count : int, char : int = 0)

Creates a string of size count, filled with the character represented by the integer value char.

s1 = string( 2, 97 ) # "aa"
s2 = string( 5, "b"[0] ) # "bbbbb"

Parameters:
  • count (int) – String size
  • char (int) – Fill character

string(count : int)

[ index : int => string ]

Creates a string by calling the code section count times and appending each result to the string.

s1 = string(2) { "a" } # "aa"
s2 = string(2) { "bb" } # "bbbb"
s3 = string(5) { [index] "hello world".char(index); } # "hello"
s4 = string(6) { [index] (string)index; } # "012345"

Parameters:
  • count (int) – Number of times to call the section.
  • CodeSection (CodeSection) – Code Section (see below)

CodeSection(index: int) → string

This code section will be called a number of times equal to count. Each time, the parameter index will increment by 1. The result of this code section will then be appended to the string being built.

Parameters: index (int) – Iteration count; starts at 0 and increases by 1 on each call, ending at count - 1.
Return type: string
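The code-section constructor described above behaves like a join over the results of count calls. As a rough illustration, here is a Python model of that behavior that reproduces the four Dao examples. The helper name build_string is hypothetical, and Python lambdas stand in for Dao code sections.

```python
from typing import Callable


def build_string(count: int, section: Callable[[int], str]) -> str:
    """Call section with index 0 .. count - 1 and concatenate the results."""
    return "".join(section(index) for index in range(count))


s1 = build_string(2, lambda index: "a")                    # "aa"
s2 = build_string(2, lambda index: "bb")                   # "bbbb"
s3 = build_string(5, lambda index: "hello world"[index])   # "hello"
s4 = build_string(6, lambda index: str(index))             # "012345"
```

Note that the resulting length is not necessarily count: it is the sum of the lengths of whatever each call returns.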

size(utf8 : bool = false) → int

Returns the size of the string in bytes or, if utf8 is true, the number of characters in the string. Note that the size operator (%) is more efficient than calling str.size().

Parameters: utf8 (bool) – Whether to treat the string as UTF-8 data.
Return type: int
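The difference between the two counts only shows up for non-ASCII text. The Python sketch below models it under the assumption that utf8 = true counts Unicode characters and utf8 = false counts encoded bytes. The function here is a stand-in for illustration, not Dao's implementation.

```python
def size(text: str, utf8: bool = False) -> int:
    """Byte count of the UTF-8 encoding, or character count when utf8 is True."""
    if utf8:
        return len(text)
    return len(text.encode("utf-8"))


ascii_bytes = size("hello")               # 5: one byte per character
accent_bytes = size("héllo")              # 6: "é" takes two bytes in UTF-8
accent_chars = size("héllo", utf8=True)   # 5
cjk_bytes = size("日本語")                 # 9: three bytes per character
cjk_chars = size("日本語", utf8=True)      # 3
```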

insert(str : string, pos : int = 0) → string

Returns a new string with the string str inserted at position pos.

Parameters:
  • str (string) – The string to insert
  • pos (int) – The location to insert the new string
Return type: string

erase(pos : int = 0, count : int = -1) → string

Returns a new string with count bytes erased, starting at position pos. If count is -1, the string from pos to the end will be erased.

Parameters:
  • pos (int) – The location to start erasing
  • count (int) – The number of bytes to erase
Return type: string
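Both insert and erase return a new string and leave the original unchanged. The Python sketch below models them on byte strings, assuming pos is a zero-based byte offset. Negative positions and out-of-range values are not modeled, since the text above does not specify them.

```python
def insert(s: bytes, new: bytes, pos: int = 0) -> bytes:
    """Return a copy of s with new placed at byte offset pos."""
    return s[:pos] + new + s[pos:]


def erase(s: bytes, pos: int = 0, count: int = -1) -> bytes:
    """Return a copy of s with count bytes removed starting at pos.

    A count of -1 removes everything from pos to the end.
    """
    if count < 0:
        return s[:pos]
    return s[:pos] + s[pos + count:]


original = b"hello world"
inserted = insert(original, b"big ", 6)   # b"hello big world"
head_only = erase(original, 5)            # b"hello"
tail_only = erase(original, 0, 6)         # b"world"
```

With the default arguments, insert prepends the new string and erase returns an empty string.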

chop(utf8 : bool = false) → string

Returns a new string with EOF, '\n', and '\r' removed from the end. Only one occurrence of each is removed, and only in the exact order mentioned: if the string ends in \n\r, only \r is removed, because the two characters are in the wrong order.

Parameters: utf8 (bool) – Whether to treat the string as UTF-8 data.
Return type: string
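The ordering rule is easiest to see by checking the last byte three times in sequence. The following Python sketch models that reading of the description. It assumes EOF means C's EOF value (-1) stored in a single byte (0xFF), and it does not model the utf8 flag.

```python
EOF_BYTE = b"\xff"  # assumption: C's EOF (-1) truncated to one byte


def chop(s: bytes) -> bytes:
    """Strip at most one trailing EOF, then one newline, then one carriage return."""
    for terminator in (EOF_BYTE, b"\n", b"\r"):
        if s.endswith(terminator):
            s = s[:-1]
    return s


crlf = chop(b"line\r\n")      # b"line": the newline goes first, then the carriage return
reversed_ = chop(b"line\n\r")  # b"line\n": only the carriage return is removed
doubled = chop(b"line\n\n")    # b"line\n": only one newline is removed
```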

trim(where : enum<head;tail> = $head + $tail, utf8 : bool = false) → string

Returns a new string with whitespace removed from one or both ends. The end to remove whitespace from is designated by the where parameter.

Parameters:
  • where (enum<head;tail>) – Where to perform the operation (beginning or end)
  • utf8 (bool) – Whether to treat the string as UTF-8 data.
Return type: string
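The effect of the where parameter can be sketched in Python as follows. This is only a model: the two boolean flags stand in for the $head and $tail enum values, and "whitespace" is assumed to mean whatever Python's strip methods remove by default, which may differ from Dao's definition.

```python
def trim(text: str, head: bool = True, tail: bool = True) -> str:
    """Remove leading and/or trailing whitespace, returning a new string."""
    if head:
        text = text.lstrip()
    if tail:
        text = text.rstrip()
    return text


both = trim("  hi  ")                    # "hi"
head_only = trim("  hi  ", tail=False)   # "hi  "
tail_only = trim("  hi  ", head=False)   # "  hi"
```

The default trims both ends, matching the $head + $tail default in the signature above.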