Learn a full model update, then compress it before sending to the server.
The client first computes the full update H_t^i during local training without any constraints, and then approximates, or encodes, the update in a (lossy) compressed form before sending it to the server. The server decodes the updates before performing the aggregation.
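A minimal sketch of that round trip, assuming a toy local training step and identity placeholders for the encoder and decoder (all names here are hypothetical, not from the paper):

```python
import numpy as np

def local_training(weights, rng):
    # Stand-in for unconstrained local training; returns a full update H_t^i.
    return 0.1 * rng.standard_normal(weights.shape)

def encode(H):
    # Identity placeholder; a lossy encoder (subsampling, quantization)
    # would replace this, as sketched in the items below.
    return H

def decode(message):
    # The server decodes every client message *before* aggregation.
    return message

rng = np.random.default_rng(0)
weights = np.zeros((8, 8))
messages = [encode(local_training(weights, rng)) for _ in range(5)]
aggregated = np.mean([decode(m) for m in messages], axis=0)
```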
Subsampling - Instead of sending H_t^i, each client only communicates a matrix Ĥ_t^i formed from a random subset of the (scaled) values of H_t^i; scaling the surviving entries makes the decoded update an unbiased estimate of H_t^i (see the sketch below).
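A sketch of unbiased subsampling, assuming the client and server share a per-round random seed so only the kept values need to travel over the wire (function names are illustrative, not the paper's):

```python
import numpy as np

def encode_subsample(H, fraction, seed):
    # Keep a random fraction of H's entries, scaled by 1/fraction
    # so that the decoded matrix equals H in expectation.
    rng = np.random.default_rng(seed)
    mask = rng.random(H.shape) < fraction
    values = H[mask] / fraction          # scale the surviving entries
    return values

def decode_subsample(values, seed, shape, fraction):
    # Server side: regenerate the same mask from the shared seed and
    # scatter the scaled values back into a zero matrix.
    rng = np.random.default_rng(seed)
    mask = rng.random(shape) < fraction
    H_hat = np.zeros(shape)
    H_hat[mask] = values
    return H_hat

H = np.arange(12.0).reshape(3, 4)
values = encode_subsample(H, fraction=0.25, seed=42)
H_hat = decode_subsample(values, seed=42, shape=H.shape, fraction=0.25)
```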
Quantize the weights - Improve the quantization with structured random rotations. The 1-bit and multi-bit quantization approaches above work best when the scales are approximately equal across different dimensions, so each update is multiplied by a random rotation matrix before quantization to spread the scales out.
In the decoding phase, the server needs to perform the inverse rotation before aggregating all the updates.
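A sketch of 1-bit probabilistic quantization combined with a structured random rotation R = HD, where H is a normalized Walsh-Hadamard matrix and D is a random diagonal matrix with ±1 entries. It assumes the update length is a power of two and a shared seed for reconstructing D; names are mine:

```python
import numpy as np
from scipy.linalg import hadamard

def encode_rotated_1bit(h, seed):
    # Rotate with R = (1/sqrt(d)) * H * D, then quantize each coordinate
    # to 1 bit (min or max), choosing probabilistically so the
    # quantization is unbiased.
    d = h.size                                   # assumed power of two
    rng = np.random.default_rng(seed)
    D = rng.choice([-1.0, 1.0], size=d)          # random diagonal signs
    Hm = hadamard(d) / np.sqrt(d)                # orthonormal Walsh-Hadamard
    z = Hm @ (D * h)                             # rotated update
    lo, hi = z.min(), z.max()
    spread = hi - lo
    p = (z - lo) / spread if spread > 0 else np.zeros(d)
    bits = rng.random(d) < p                     # round up with prob. p
    return bits, lo, hi

def decode_rotated_1bit(bits, lo, hi, seed):
    # Server side: dequantize, then apply the inverse rotation
    # R^{-1} = D * Hm (both D and the normalized Hadamard matrix are
    # their own inverses) before aggregating.
    d = bits.size
    rng = np.random.default_rng(seed)
    D = rng.choice([-1.0, 1.0], size=d)
    Hm = hadamard(d) / np.sqrt(d)
    z_hat = np.where(bits, hi, lo)               # unbiased 1-bit dequantization
    return D * (Hm @ z_hat)

h = np.random.default_rng(1).standard_normal(16)
bits, lo, hi = encode_rotated_1bit(h, seed=7)
h_hat = decode_rotated_1bit(bits, lo, hi, seed=7)
```

Aggregation then runs on the decoded, de-rotated updates, matching the order described above.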