go / beginner
Snippet
Understanding Strings, Runes, and Bytes in Go
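In Go, a string is a read-only slice of bytes that conventionally holds UTF-8 encoded text. In UTF-8, each Unicode code point (a rune in Go) occupies 1 to 4 bytes. The built-in len() function returns the number of bytes, not characters. To count code points, convert to []rune or call utf8.RuneCountInString, which avoids allocating a new slice. Note that a rune is a code point, which is not always what a reader perceives as one character: combining marks, for example, are separate runes. When iterating with range, Go decodes one rune at a time and yields the byte offset at which each rune starts, so multi-byte characters are never split.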
snippet.go
package main

import (
	"fmt"
	"unicode"
)

func main() {
	str := "Hello, 世界"

	fmt.Printf("String: %s\n", str)
	fmt.Printf("Length in bytes: %d\n", len(str))
	fmt.Printf("Length in runes: %d\n", len([]rune(str)))

	fmt.Println("\nIterating by byte:")
	for i := 0; i < len(str); i++ {
		fmt.Printf(" [%d] = 0x%X\n", i, str[i])
	}

	fmt.Println("\nIterating by rune:")
	for i, r := range str {
		fmt.Printf(" [%d] = '%c' (U+%04X)\n", i, r, r)
	}

	fmt.Printf("\nIs '世' a Chinese character? %v\n", unicode.Is(unicode.Han, '世'))
}
Breakdown
1
str := "Hello, 世界"
Creates a string mixing ASCII with multi-byte characters: 7 one-byte ASCII characters plus two 3-byte characters, 13 bytes in total
2
len([]rune(str))
Converts the string to a rune slice to count Unicode code points (9 here) instead of bytes (13)
3
for i, r := range str
Range decodes one rune per iteration and yields the byte offset where that rune starts (not a character index) together with its Unicode code point
4
unicode.Is(unicode.Han, '世')
Uses the unicode package to check whether a rune belongs to a range table, here unicode.Han, the Han script shared by Chinese, Japanese, and Korean text