Golang Unsafe Type Conversions and Memory Access

Written by avk-ai | Published 2020/03/15

TL;DR: Go is strongly typed, so you have to convert a variable from one type to another to use it in different parts of an application. Sometimes, though, you need to step around this type safety. Unsafe operations can save a lot of allocations and give access to any struct field, as well as the internals of slices, strings, and maps. This can help when optimizing bottlenecks in high-load systems, where every tick counts. It's called unsafe for a reason.

Illustration composed from MariaLetta/free-gophers-pack, original gopher by Renee French.
Warning! Avoid using unsafe in your applications unless you are 100% sure what you are doing. It's called unsafe for a reason.
Go is strongly typed: you have to convert a variable from one type to another to use it in different parts of an application. But sometimes you need to step around this type safety, for example to optimize bottlenecks in high-load systems, where every tick counts. Unsafe operations can potentially save a lot of allocations. The unsafe package also lets you reach into any struct field, as well as the internals of slices, strings, maps, etc.

Change string example

Strings are immutable in Go. If you want to change a string, you usually have to allocate a new one. Let's hack Go a little with the unsafe package and actually change a string!
package main

import (
	"fmt"
	"reflect"
	"time"
	"unsafe"
)

func main() {
	a := "Hello. Current time is " + time.Now().String()

	fmt.Println(a)

	stringHeader := (*reflect.StringHeader)(unsafe.Pointer(&a))
	*(*byte)(unsafe.Pointer(stringHeader.Data + 5)) = '!'

	fmt.Println(a)
}
Result:
Hello. Current time is 2020-03-14 21:40:38.36328248 +0300 +03 m=+0.000037994
Hello! Current time is 2020-03-14 21:40:38.36328248 +0300 +03 m=+0.000037994
String is changed!
What's going on in this code? The steps are:
  1. Convert the string pointer to unsafe.Pointer. Any pointer can be converted to unsafe.Pointer and vice versa. An unsafe.Pointer can also be converted to uintptr, the address in integer form.
  2. Convert the unsafe.Pointer to a reflect.StringHeader pointer to get at the Data field. The reflect.StringHeader type mirrors the runtime's internal string struct: under the hood, a string is represented by a length and a uintptr holding the address of the memory with the data.
  3. Move the pointer 5 bytes forward. With uintptr it is possible to do pointer arithmetic.
  4. Convert the new uintptr to unsafe.Pointer, and then to a byte pointer. This byte is actually part of the string.
  5. Assign a new value to the byte through the pointer. The string is changed!
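For readers on newer toolchains, the same steps can be sketched with helpers added after this article was written. This assumes Go 1.20 or later (unsafe.StringData) and uses unsafe.Add from Go 1.17. The helper name setByte is made up for this illustration, and the unsafe documentation states that bytes returned by StringData must not be modified, so treat it as a demonstration of the mechanics rather than a recommended practice.

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// setByte overwrites the byte at index i of s in place.
// Hypothetical helper: s must have been built at run time. Writing into a
// string constant faults, because constants live in read-only memory.
func setByte(s string, i int, c byte) {
	p := unsafe.StringData(s)                      // steps 1-2: pointer to the first byte of the data
	q := (*byte)(unsafe.Add(unsafe.Pointer(p), i)) // steps 3-4: move i bytes forward, view as *byte
	*q = c                                         // step 5: assign through the pointer
}

func main() {
	// strings.Join builds the string at run time, so its bytes are writable.
	a := strings.Join([]string{"Hello.", "World"}, " ")
	setByte(a, 5, '!')
	fmt.Println(a) // Hello! World
}
```

The arithmetic here never passes through a uintptr value, so there is no moment at which the garbage collector loses track of the pointer.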
By the way, the following code produces a segmentation violation. The Go compiler knows that the string is a constant and places it in memory along with other constants, where writes are forbidden. That's why I used time.Now() in the first example.
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

func main() {
	a := "Hello. Have a nice day!"

	fmt.Println(a)

	stringHeader := (*reflect.StringHeader)(unsafe.Pointer(&a))
	*(*byte)(unsafe.Pointer(stringHeader.Data + 5)) = '!'

	fmt.Println(a)
}
Result:
Hello. Have a nice day!
unexpected fault address 0x4c3e31
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x2 addr=0x4c3e31 pc=0x48cf33]

Package unsafe

https://golang.org/pkg/unsafe/
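At the time of writing, the package consists of only three functions and a single usable type. (Later Go releases added a few helpers, such as unsafe.Add and unsafe.Slice.)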
func Alignof(x ArbitraryType) uintptr
func Offsetof(x ArbitraryType) uintptr
func Sizeof(x ArbitraryType) uintptr
type Pointer *ArbitraryType

type ArbitraryType
// represents the type of an arbitrary Go expression,
// not actually part of a package
Offsetof gives the offset of a field within a struct.
Sizeof gives the size of a variable in memory; referenced memory is not included.
Alignof gives the required alignment of a variable's address.
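To see how these three functions relate, here is a small sketch that compares two structs holding the same fields in a different order. The type names Padded and Packed and the helper functions are invented for this illustration; the exact numbers depend on the platform.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Padded and Packed hold the same fields in a different order.
// Both type names are made up for this illustration.
type Padded struct {
	a byte
	b int64
	c byte
}

type Packed struct {
	b int64
	a byte
	c byte
}

// sizes returns the in-memory size of each struct layout.
func sizes() (padded, packed uintptr) {
	return unsafe.Sizeof(Padded{}), unsafe.Sizeof(Packed{})
}

// offsetAndAlignOfB returns where b sits inside Padded and the alignment b requires.
func offsetAndAlignOfB() (offset, align uintptr) {
	var p Padded
	return unsafe.Offsetof(p.b), unsafe.Alignof(p.b)
}

func main() {
	off, align := offsetAndAlignOfB()
	// The one-byte field a is followed by padding so that b starts on its alignment boundary.
	fmt.Println("offset of b:", off, "alignment of b:", align)

	padded, packed := sizes()
	// Same fields, different total size because of padding.
	fmt.Println("sizes:", padded, packed)
}
```

On amd64 this reports an offset and alignment of 8 and sizes of 24 and 16: placing the widest field first removes most of the padding.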
Some examples of using the unsafe.Offsetof and unsafe.Sizeof functions:
  1. Unsafe array iteration and type conversion on the fly with zero allocations.
  2. Getting information about the actual size of structs in memory.
  3. Changing struct field directly in memory, with struct pointer and field offset.
package main

import (
	"fmt"
	"unsafe"
)

type Bytes400 struct {
	val [100]int32
}

type TestStruct struct {
	a [9]int64
	b byte
	c *Bytes400
	d int64
}

func main() {
	array := [10]uint64{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}

	var sum int8

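	// Unsafe array iteration: read the first byte of each uint64 as an int8
	// (that is the low-order byte only on little-endian CPUs)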
	sizeOfUint64 := unsafe.Sizeof(array[0])
	for i := uintptr(0); i < 10; i++ {
		sum += *(*int8)(unsafe.Pointer(uintptr(unsafe.Pointer(&array)) + sizeOfUint64*i))
	}
	fmt.Println(sum)

	// Size of struct and offsets of struct fields
	t := TestStruct{b: 42}
	fmt.Println(unsafe.Sizeof(t))
	fmt.Println(unsafe.Offsetof(t.a), unsafe.Offsetof(t.b), unsafe.Offsetof(t.c), unsafe.Offsetof(t.d))

	fmt.Println(unsafe.Sizeof(Bytes400{}))

	// Change struct field t.b value
	*(*byte)(unsafe.Pointer(uintptr(unsafe.Pointer(&t)) + unsafe.Offsetof(t.b)))++
	fmt.Println(t.b)
}
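Result (the struct size and offsets shown are from a platform with 4-byte pointers; on 64-bit platforms such as amd64 the second and third lines are 96 and 0 72 80 88):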
55
88
0 72 76 80
400
43

Pointer arithmetic and garbage collector

An unsafe.Pointer can be converted to a uintptr and vice versa. The classic way to perform pointer arithmetic is through uintptr. The general rule is that a uintptr is an integer value without pointer semantics, so it is not safe to keep a uintptr as the only reference to an object: the garbage collector is unaware of it, and the object's memory could be reused. It is best to convert it back to unsafe.Pointer right away, in the same expression as the arithmetic.
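As a minimal sketch of that rule, the first helper below keeps the conversion to uintptr, the arithmetic, and the conversion back in a single expression. The second does the same with unsafe.Add, which assumes Go 1.17 or later. The function names elemAt and elemAtAdd are hypothetical.

```go
package main

import (
	"fmt"
	"unsafe"
)

// elemAt returns a pointer to element i of arr using uintptr arithmetic.
// The conversion to uintptr, the addition, and the conversion back all happen
// in one expression, so no bare uintptr address is ever stored in a variable.
func elemAt(arr *[4]int32, i uintptr) *int32 {
	return (*int32)(unsafe.Pointer(uintptr(unsafe.Pointer(arr)) + i*unsafe.Sizeof(arr[0])))
}

// elemAtAdd does the same with unsafe.Add (Go 1.17+), which keeps the value
// an unsafe.Pointer throughout.
func elemAtAdd(arr *[4]int32, i int) *int32 {
	return (*int32)(unsafe.Add(unsafe.Pointer(arr), i*int(unsafe.Sizeof(arr[0]))))
}

func main() {
	arr := [4]int32{10, 20, 30, 40}
	fmt.Println(*elemAt(&arr, 2), *elemAtAdd(&arr, 3)) // 30 40
}
```

The pattern to avoid is saving uintptr(unsafe.Pointer(arr)) in a variable and converting it back in a later statement: between the two statements nothing tells the garbage collector that the object is still referenced.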

Benchmarks of unsafe string to []byte conversion

package main

import (
	"reflect"
	"unsafe"
)

func SafeBytesToString(bytes []byte) string {
	return string(bytes)
}

func SafeStringToBytes(s string) []byte {
	return []byte(s)
}

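func UnsafeBytesToString(bytes []byte) string {
	// A slice header starts with the same Data and Len fields as a string
	// header, so the slice can be reinterpreted as a string directly.
	return *(*string)(unsafe.Pointer(&bytes))
}

func UnsafeStringToBytes(s string) []byte {
	// Fill in the header of a real slice variable rather than allocating a
	// bare reflect.SliceHeader, whose uintptr Data field would not keep the
	// bytes alive for the garbage collector.
	var bytes []byte
	stringHeader := (*reflect.StringHeader)(unsafe.Pointer(&s))
	sliceHeader := (*reflect.SliceHeader)(unsafe.Pointer(&bytes))
	sliceHeader.Data = stringHeader.Data
	sliceHeader.Len = stringHeader.Len
	sliceHeader.Cap = stringHeader.Len
	return bytes
}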
go test -bench=.
Benchmark results:
BenchmarkSafeBytesToString-8       257141380       4.50 ns/op
BenchmarkSafeStringToBytes-8       227980887       5.38 ns/op
BenchmarkUnsafeBytesToString-8     1000000000      0.305 ns/op
BenchmarkUnsafeStringToBytes-8     1000000000      0.274 ns/op
The unsafe conversion is more than 10 times faster! But it only creates a new header; the underlying data stays shared. So if you change the slice after the conversion, the string changes too. The following test passes:
func TestUnsafeBytesToString(t *testing.T) {
	bs := []byte("Test")
	str := UnsafeBytesToString(bs)
	if str != "Test" {
		t.Fail()
	}

	// Test string mutation
	bs[0] = 't'
	if str != "test" {
		t.Fail()
	}
}
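On newer toolchains the same zero-copy conversions can be written without the reflect header types. The sketch below assumes Go 1.20 or later, where unsafe.String, unsafe.StringData, unsafe.Slice, and unsafe.SliceData are available. The function names are hypothetical, and the sharing caveat is unchanged: the string and the slice point at the same bytes.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString reinterprets b as a string without copying (Go 1.20+).
// The caller must not modify b while the string is still in use.
func bytesToString(b []byte) string {
	return unsafe.String(unsafe.SliceData(b), len(b))
}

// stringToBytes reinterprets s as a byte slice without copying (Go 1.20+).
// The returned slice must be treated as read-only.
func stringToBytes(s string) []byte {
	return unsafe.Slice(unsafe.StringData(s), len(s))
}

func main() {
	bs := []byte("Test")
	str := bytesToString(bs)

	// The string shares memory with the slice, so it observes the change.
	bs[0] = 't'
	fmt.Println(str) // test

	fmt.Println(len(stringToBytes("abc"))) // 3
}
```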
If you really need to speed up conversions or hack inside Go structures, use the unsafe package. In other cases, try to avoid it: it is easy to mess something up with it.
Thanks for reading and have a nice safe day!

Written by avk-ai | Go developer. Blockchain expert.
Published by HackerNoon on 2020/03/15