I have a couple more ideas for performance improvements and am wondering how you feel about all of the following:
- How do you feel about replacing some of the internal maps that are keyed on field number with slices? My profiling shows a large chunk of time spent in map_access routines, which could probably be reduced heavily by using fixed-size slices. For example, if a message's maximum field number is 100, we would use a slice of size 101, so `m.values` could become `[]interface{}`.
- Adding "typed" APIs so that I can do something like `m.GetInt64FieldByFieldNumber(fieldNum)`. My profiling shows that the library spends a lot of time allocating (and collecting) `interface{}` values. We could alleviate this a lot by adding support for typed APIs and then using union types internally for storing the data.
- Allowing allocation of `[]byte` to be controlled. Right now I have no control over how `[]byte` fields are allocated. It would be nice if I could configure my own allocation function, or somehow indicate that I would rather the library take a subslice of the buffer it is unmarshaling from instead of allocating a new one.
- Probably the biggest improvement for me would be if I could unmarshal buffers in an iterative fashion: basically, doing what `codedBuffer` and the `m.unmarshal` method do internally, with an iterator that reads the stream and gives me the value of each field one at a time. That would probably make a huge performance improvement for my workload.
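
To make the first two ideas concrete, here is a minimal sketch of slice-indexed storage combined with a tagged union and a typed getter. It is an illustration only, not the library's actual internals: the `value`, `kind`, and `message` types and the `SetInt64FieldByFieldNumber` setter are hypothetical names, and only `GetInt64FieldByFieldNumber` comes from the proposal above.

```go
package main

import "fmt"

// kind tags which member of the value union is populated.
type kind uint8

const (
	kindUnset kind = iota
	kindInt64
	kindBytes
)

// value is a hypothetical tagged union. Scalars are stored in num,
// so holding an int64 does not require boxing it in an interface{}.
type value struct {
	kind  kind
	num   uint64
	bytes []byte
}

// message stores its fields in a slice indexed by field number
// rather than in a map keyed on field number.
type message struct {
	values []value // len == maxFieldNumber+1; index 0 is unused
}

func newMessage(maxFieldNumber int) *message {
	return &message{values: make([]value, maxFieldNumber+1)}
}

// SetInt64FieldByFieldNumber stores v without allocating.
// It panics if fieldNum is outside the range given to newMessage.
func (m *message) SetInt64FieldByFieldNumber(fieldNum int, v int64) {
	m.values[fieldNum] = value{kind: kindInt64, num: uint64(v)}
}

// GetInt64FieldByFieldNumber returns the stored value and whether
// the field is present and holds an int64.
func (m *message) GetInt64FieldByFieldNumber(fieldNum int) (int64, bool) {
	if fieldNum <= 0 || fieldNum >= len(m.values) {
		return 0, false
	}
	v := m.values[fieldNum]
	if v.kind != kindInt64 {
		return 0, false
	}
	return int64(v.num), true
}

func main() {
	m := newMessage(100)
	m.SetInt64FieldByFieldNumber(7, -42)
	v, ok := m.GetInt64FieldByFieldNumber(7)
	fmt.Println(v, ok) // prints "-42 true"
}
```

One trade-off of this layout is that a message with sparse, large field numbers wastes memory on unused slots, so a real implementation would probably need a fallback to a map above some threshold.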
I realize that these changes may not align with your vision for the library, so I just wanted to get your thoughts.
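
For the iterator and `[]byte` subslicing ideas, the following is a rough sketch of what such an API might look like. It uses only the standard library and decodes the protobuf wire format directly. The `fieldIterator` and `field` types are hypothetical and do not reflect how `codedBuffer` is actually implemented. Group wire types are not handled.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Wire types from the protobuf encoding.
const (
	wireVarint  = 0
	wireFixed64 = 1
	wireBytes   = 2
	wireFixed32 = 5
)

// field is one decoded field. Bytes aliases the input buffer.
type field struct {
	Number   int
	WireType int
	Num      uint64 // varint, fixed32, or fixed64 payload
	Bytes    []byte // length-delimited payload (a subslice, not a copy)
}

var errTruncated = errors.New("truncated or malformed message")

// fieldIterator is a hypothetical streaming reader over an encoded message.
type fieldIterator struct {
	buf []byte
	err error
}

func (it *fieldIterator) fail() bool {
	it.err = errTruncated
	return false
}

// Next decodes the next field into f. It returns false at the end of
// the input or on error; check it.err to tell the two apart.
func (it *fieldIterator) Next(f *field) bool {
	if len(it.buf) == 0 || it.err != nil {
		return false
	}
	tag, n := binary.Uvarint(it.buf)
	if n <= 0 {
		return it.fail()
	}
	it.buf = it.buf[n:]
	*f = field{Number: int(tag >> 3), WireType: int(tag & 7)}
	switch f.WireType {
	case wireVarint:
		v, n := binary.Uvarint(it.buf)
		if n <= 0 {
			return it.fail()
		}
		f.Num, it.buf = v, it.buf[n:]
	case wireFixed64:
		if len(it.buf) < 8 {
			return it.fail()
		}
		f.Num, it.buf = binary.LittleEndian.Uint64(it.buf), it.buf[8:]
	case wireFixed32:
		if len(it.buf) < 4 {
			return it.fail()
		}
		f.Num, it.buf = uint64(binary.LittleEndian.Uint32(it.buf)), it.buf[4:]
	case wireBytes:
		l, n := binary.Uvarint(it.buf)
		if n <= 0 || l > uint64(len(it.buf)-n) {
			return it.fail()
		}
		end := n + int(l)
		f.Bytes, it.buf = it.buf[n:end], it.buf[end:]
	default:
		it.err = fmt.Errorf("unsupported wire type %d", f.WireType)
		return false
	}
	return true
}

func main() {
	// Field 1 = varint 150, field 2 = bytes "hi".
	buf := []byte{0x08, 0x96, 0x01, 0x12, 0x02, 'h', 'i'}
	it := &fieldIterator{buf: buf}
	var f field
	for it.Next(&f) {
		fmt.Printf("field=%d num=%d bytes=%q\n", f.Number, f.Num, f.Bytes)
	}
	if it.err != nil {
		fmt.Println("error:", it.err)
	}
}
```

Because `Bytes` is a subslice of the input, the caller must not reuse or modify the buffer while still holding on to field values, which is why this behavior would need to be opt-in.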