Understanding Strings, Runes, and Bytes in Go

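In Go, a string is a read-only slice of bytes, and string literals in Go source code are UTF-8 encoded. Under UTF-8, each Unicode code point occupies 1 to 4 bytes. The built-in len() function returns the number of bytes, not characters. To count code points (runes), convert the string to []rune first or call utf8.RuneCountInString. A rune is a single code point, so a character written with combining marks still counts as more than one rune. When iterating with range, Go decodes one rune per iteration, which makes it safe for multi-byte characters; the index it yields is the byte offset at which each rune starts.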
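As a minimal sketch of the byte-versus-rune distinction described above, the following program compares both counts for strings whose single character needs 1, 2, 3, and 4 bytes in UTF-8. The helper name byteAndRuneCount is hypothetical and exists only for this illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCount is a hypothetical helper for illustration only.
// It returns the length of s in bytes and in Unicode code points.
func byteAndRuneCount(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	samples := []string{
		"A",          // U+0041, 1 byte in UTF-8
		"\u00e9",     // U+00E9 (e with acute accent), 2 bytes
		"世",          // U+4E16, 3 bytes
		"\U0001F600", // U+1F600 (an emoji), 4 bytes
	}
	for _, s := range samples {
		b, r := byteAndRuneCount(s)
		fmt.Printf("%q: %d byte(s), %d rune(s)\n", s, b, r)
	}
}
```

Each sample is one rune, so only the byte count changes from line to line.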

snippet.go
go
package main
 
import (
    "fmt"
    "unicode"
)
 
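func main() {
    str := "Hello, 世界"
    fmt.Printf("String: %s\n", str)
    // len reports bytes: 7 ASCII bytes plus 3 bytes for each of the two Han characters.
    fmt.Printf("Length in bytes: %d\n", len(str))
    // Converting to []rune counts Unicode code points instead of bytes.
    fmt.Printf("Length in runes: %d\n", len([]rune(str)))

    // Indexing a string yields single bytes, so multi-byte characters are split apart.
    fmt.Println("\nIterating by byte:")
    for i := 0; i < len(str); i++ {
        fmt.Printf("  [%d] = 0x%X\n", i, str[i])
    }

    // range decodes UTF-8: i is the byte offset of each rune, r is its code point.
    fmt.Println("\nIterating by rune:")
    for i, r := range str {
        fmt.Printf("  [%d] = '%c' (U+%04X)\n", i, r, r)
    }

    // unicode.Han is the range table for the Han script, which Chinese, Japanese, and Korean share.
    fmt.Printf("\nIs '世' a Han character? %v\n", unicode.Is(unicode.Han, '世'))
}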
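
The index produced by the range loop in the program above is a byte offset, not a rune position, so it skips ahead after each multi-byte character. The sketch below makes that visible and also shows what range does with a byte that is not valid UTF-8. The helper name runeOffsets is hypothetical.

```go
package main

import "fmt"

// runeOffsets is a hypothetical helper for illustration only.
// It returns the byte offset at which each rune of s starts.
func runeOffsets(s string) []int {
	var offsets []int
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	// The two Han characters start at bytes 7 and 10; each is 3 bytes wide.
	fmt.Println(runeOffsets("Hello, 世界")) // prints [0 1 2 3 4 5 6 7 10]

	// An invalid byte decodes to U+FFFD (utf8.RuneError), and the loop
	// advances by one byte before continuing.
	for i, r := range "a\xffb" {
		fmt.Printf("%d: %U\n", i, r)
	}
}
```

The second loop reports U+0061 at offset 0, U+FFFD at offset 1, and U+0062 at offset 2.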

Breakdown

str := "Hello, 世界"

Creates a string containing both ASCII and multi-byte Unicode characters

len([]rune(str))

Converts the string to a rune slice to count Unicode code points instead of bytes

for i, r := range str

range decodes one rune per iteration, yielding the byte index where the rune starts and its Unicode code point

unicode.Is(unicode.Han, '世')

Uses the unicode package to check whether a rune belongs to a range table, here the Han script
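
The same unicode.Is check works with other range tables from the unicode package, and unicode.In tests several tables at once. The following is a rough sketch of a script classifier built on those calls; the function name scriptOf and its labels are hypothetical, and it covers only a handful of scripts.

```go
package main

import (
	"fmt"
	"unicode"
)

// scriptOf is a hypothetical helper for illustration only.
// It labels a rune by checking a few of the unicode package's script tables.
func scriptOf(r rune) string {
	switch {
	case unicode.Is(unicode.Han, r):
		return "Han"
	case unicode.In(r, unicode.Hiragana, unicode.Katakana):
		return "Kana"
	case unicode.Is(unicode.Latin, r):
		return "Latin"
	default:
		return "other"
	}
}

func main() {
	for _, r := range "Aあア世!" {
		fmt.Printf("%c (%U): %s\n", r, r, scriptOf(r))
	}
}
```

Punctuation such as '!' belongs to none of these script tables, so it falls through to the default case.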
