Protobuf

Language: CPP

Data

Protobuf was developed by Google as a compact, fast alternative to XML and JSON for data serialization. Developers define message structures in `.proto` files, and the `protoc` compiler generates C++ classes that handle serialization and deserialization. It is widely used in gRPC, distributed systems, and file storage.

Protocol Buffers (Protobuf) is a language-neutral, platform-neutral, and extensible mechanism for serializing structured data. In C++, it allows defining message schemas and efficiently serializing/deserializing structured data for inter-process communication, data storage, and network protocols.

Installation

linux: sudo apt install protobuf-compiler libprotobuf-dev
mac: brew install protobuf
windows: Download precompiled binaries from https://github.com/protocolbuffers/protobuf/releases

Usage

In C++, Protobuf uses generated classes from `.proto` files. You can serialize messages to binary strings or streams, parse messages from them, and use features like nested messages, enums, repeated fields, and optional fields for efficient and structured data representation.

Defining a message in .proto

// person.proto
syntax = "proto3";
message Person {
  string name = 1;
  int32 id = 2;
  string email = 3;
}

Defines a simple `Person` message with the fields `name`, `id`, and `email`. Each field has a unique tag number that identifies it in the binary encoding; field names are not written to the wire.
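To make the role of tag numbers concrete, here is a minimal sketch of how the `name` and `id` fields would appear on the wire. It is not the library's implementation: `AppendVarint`, `AppendKey`, and `EncodePersonSketch` are hypothetical helper names, the `email` field is left out for brevity, and `id` is assumed to be non-negative. Real code should always use the generated classes.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Wire types used here (numeric values come from the Protobuf encoding rules).
constexpr std::uint32_t kVarint = 0;           // int32, bool, enum, ...
constexpr std::uint32_t kLengthDelimited = 2;  // string, bytes, embedded messages

// Appends `value` as a base-128 varint: 7 payload bits per byte,
// with the high bit set on every byte except the last.
void AppendVarint(std::uint64_t value, std::string* out) {
    while (value >= 0x80) {
        out->push_back(static_cast<char>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out->push_back(static_cast<char>(value));
}

// Appends the field key: (field_number << 3) | wire_type.
void AppendKey(std::uint32_t field_number, std::uint32_t wire_type,
               std::string* out) {
    assert(field_number >= 1);  // tag numbers start at 1
    AppendVarint((static_cast<std::uint64_t>(field_number) << 3) | wire_type, out);
}

// Hand-encodes name (tag 1) and id (tag 2). Like proto3, it skips fields
// that hold their default value (empty string, zero).
std::string EncodePersonSketch(const std::string& name, std::uint32_t id) {
    std::string out;
    if (!name.empty()) {
        AppendKey(1, kLengthDelimited, &out);
        AppendVarint(name.size(), &out);  // byte length, then the bytes
        out += name;
    }
    if (id != 0) {
        AppendKey(2, kVarint, &out);
        AppendVarint(id, &out);
    }
    return out;
}
```

Under these assumptions, `EncodePersonSketch("Alice", 123)` produces the nine bytes `0A 05 41 6C 69 63 65 10 7B`: key `0A` (tag 1, length-delimited), length 5, the text, then key `10` (tag 2, varint) and the value `7B`. Only the numbers identify the fields, which is why they must stay stable.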

Generating C++ classes

# Terminal command
protoc --cpp_out=. person.proto

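Generates `person.pb.h` and `person.pb.cc`, which define the C++ `Person` class. Compile `person.pb.cc` together with your program and link against `libprotobuf` (for example with the flags printed by `pkg-config --cflags --libs protobuf`).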

Creating and serializing a message

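#include "person.pb.h"
#include <fstream>
#include <iostream>

int main() {
    Person p;
    p.set_name("Alice");
    p.set_id(123);
    p.set_email("alice@example.com");

    std::ofstream out("person.bin", std::ios::binary);
    // SerializeToOstream returns false if the message cannot be written.
    if (!out || !p.SerializeToOstream(&out)) {
        std::cerr << "Failed to write person.bin" << std::endl;
        return 1;
    }
    return 0;
}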

Creates a `Person` object, sets its fields, and serializes it to a binary file.

Deserializing a message

#include "person.pb.h"
#include <fstream>
#include <iostream>

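int main() {
    Person p;
    std::ifstream in("person.bin", std::ios::binary);
    // ParseFromIstream returns false on malformed or truncated input.
    if (!in || !p.ParseFromIstream(&in)) {
        std::cerr << "Failed to read person.bin" << std::endl;
        return 1;
    }
    std::cout << p.name() << ", " << p.id() << ", " << p.email() << std::endl;
    return 0;
}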

Reads the binary file and reconstructs the `Person` object, printing its fields.
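Messages can also be serialized to an in-memory `std::string` instead of a stream. The following is a minimal sketch of such a round trip. `RoundTripThroughString` is a hypothetical helper name; it is written as a template so it compiles without the generated header, and it relies only on the `SerializeToString` and `ParseFromString` methods that generated message classes provide.

```cpp
#include <cassert>
#include <string>

// Serializes `original` into a byte string and parses it back into `*copy`.
// Message is meant to be a protoc-generated class such as Person.
template <typename Message>
bool RoundTripThroughString(const Message& original, Message* copy) {
    assert(copy != nullptr);
    std::string bytes;  // binary data; may contain '\0', so it is not text
    if (!original.SerializeToString(&bytes)) {
        return false;
    }
    return copy->ParseFromString(bytes);
}
```

With the generated class, a call would look like `Person copy; RoundTripThroughString(p, &copy);`. The intermediate string is what you would hand to a socket, a database column, or a message queue.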

Nested messages

message Address {
  string street = 1;
  string city = 2;
}
message Person {
  string name = 1;
  Address address = 2;
}

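Uses one message type (`Address`) as a field of another (`Person`). A message type can also be declared inside another message's body when it is only used there. In C++, the generated `Person` class exposes `address()`, `mutable_address()`, and `has_address()` for this field.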
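As a sketch of how an embedded message is filled in and read from C++, the helpers below use the accessors that `protoc` generates for the `Person` and `Address` messages above. `SetAddress` and `FormatAddress` are hypothetical names, and both are templates only so that the snippet compiles without `person.pb.h`; in a real program `PersonT` would simply be `Person`.

```cpp
#include <cassert>
#include <string>

template <typename PersonT>
void SetAddress(PersonT* person, const std::string& street,
                const std::string& city) {
    assert(person != nullptr);
    // mutable_address() returns a pointer to the embedded Address and
    // creates it on first use; the Person keeps ownership.
    auto* address = person->mutable_address();
    address->set_street(street);
    address->set_city(city);
}

template <typename PersonT>
std::string FormatAddress(const PersonT& person) {
    // has_address() tells "never set" apart from an Address with empty fields.
    if (!person.has_address()) {
        return "(no address)";
    }
    std::string text(person.address().street());
    text += ", ";
    text += person.address().city();
    return text;
}
```

Calling `address()` on a `Person` without an address is safe and returns a default instance, so the `has_address()` check matters only when the distinction is meaningful to the application.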

Repeated fields

message Person {
  string name = 1;
  repeated string phone_numbers = 2;
}

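Stores any number of phone numbers for a single person in a `repeated` field. In C++ the field is accessed through `add_phone_numbers()`, `phone_numbers_size()`, and `phone_numbers(i)`.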
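The sketch below shows how a repeated field is typically appended to and iterated in C++, using the accessors generated for `phone_numbers`. `AddPhoneNumbers` and `JoinPhoneNumbers` are hypothetical helper names, and the template parameter exists only so the snippet compiles without the generated header; with generated code `PersonT` would be `Person`.

```cpp
#include <cassert>
#include <string>
#include <vector>

template <typename PersonT>
void AddPhoneNumbers(PersonT* person, const std::vector<std::string>& numbers) {
    assert(person != nullptr);
    for (const std::string& number : numbers) {
        person->add_phone_numbers(number);  // appends one element
    }
}

template <typename PersonT>
std::string JoinPhoneNumbers(const PersonT& person) {
    std::string joined;
    // phone_numbers_size() is the element count, phone_numbers(i) the i-th entry.
    for (int i = 0; i < person.phone_numbers_size(); ++i) {
        if (i > 0) {
            joined += "; ";
        }
        joined += person.phone_numbers(i);
    }
    return joined;
}
```

Elements keep their insertion order across serialization, so the joined string reads the same on the receiving side.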

Enumerations

enum PhoneType {
  MOBILE = 0;
  HOME = 1;
  WORK = 2;
}
message PhoneNumber {
  string number = 1;
  PhoneType type = 2;
}

Defines an enumeration, a fixed set of named values for a field. In proto3 the first value must be zero, and it serves as the field's default.
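A proto3 enum field can carry a number that the reader's schema does not list, for example when a newer sender has added a value. The sketch below shows one way to handle that defensively. The `PhoneType` enum here is a hand-written mirror of the one above so the snippet stands alone (generated code would supply it), and `DescribePhoneType` is a hypothetical helper.

```cpp
#include <cassert>
#include <string>

// Hand-written mirror of the PhoneType enum, for illustration only.
enum PhoneType : int { MOBILE = 0, HOME = 1, WORK = 2 };

// Maps a raw enum number to a label and keeps unknown numbers visible
// instead of silently treating them as one of the known values.
std::string DescribePhoneType(int raw_value) {
    switch (raw_value) {
        case MOBILE: return "mobile";
        case HOME:   return "home";
        case WORK:   return "work";
        default:     return "unrecognized (" + std::to_string(raw_value) + ")";
    }
}
```

The `default` branch is the point of the example: code that switches over an enum from the wire should decide explicitly what to do with values it does not know.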

Integration with gRPC in C++

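// In .proto file
// PersonRequest and PersonResponse must be defined as messages in this file or an imported one
service PersonService {
  rpc GetPerson(PersonRequest) returns (PersonResponse);
}
// Generate the C++ messages and the gRPC server/client stubs with the gRPC plugin:
// protoc --cpp_out=. --grpc_out=. --plugin=protoc-gen-grpc=`which grpc_cpp_plugin` person.proto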

Shows how Protobuf is used to define services and messages for C++ gRPC communication.

Error Handling

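Parse failure: The C++ API does not throw on bad input. `ParseFromIstream` and `ParseFromString` return `false` when the data is malformed or truncated, so always check the result. Because the wire format does not record the message type, bytes written as one message type can parse successfully as another and yield meaningless field values; make sure reader and writer agree on the type.
Type mismatch: The generated setters are typed, so assigning a value of the wrong type is normally caught at compile time. Rerun `protoc` after every change to the `.proto` file so the generated code matches the schema.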
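One way to keep the two failure modes apart is to report whether the stream or the parse went wrong. The following is a minimal sketch under that assumption: `LoadStatus` and `LoadMessage` are hypothetical names, and the function is a template that only requires the `ParseFromIstream` method that generated message classes provide.

```cpp
#include <cassert>
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this in memory

enum class LoadStatus { kOk, kStreamError, kParseError };

// Parses `*out` from `in` and reports which step failed, if any.
template <typename Message>
LoadStatus LoadMessage(std::istream& in, Message* out) {
    assert(out != nullptr);
    if (!in.good()) {
        return LoadStatus::kStreamError;  // e.g. file missing or unreadable
    }
    if (!out->ParseFromIstream(&in)) {
        return LoadStatus::kParseError;   // malformed or truncated bytes
    }
    return LoadStatus::kOk;
}
```

With generated code this could be called as `Person p; std::ifstream in("person.bin", std::ios::binary); LoadMessage(in, &p);`. A `kOk` result only means the bytes were well-formed for the message type, not that the values make sense.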

Best Practices

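Assign every field a unique tag number and never reuse a deleted one. Mark removed numbers with `reserved` so that old data cannot be misread as a new field, which keeps the schema backward compatible.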

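Use `proto3` syntax for simplicity. Unset fields read back as default values (zero, empty string, `false`, the first enum value); mark a scalar field `optional` when you need to tell an unset field from one explicitly set to its default.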

Keep messages small for efficient serialization; each message is parsed and held in memory as a whole.

Use nested messages and enums to organize complex structures.

Validate data before serialization and after deserialization when needed.
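A successful parse only means the bytes were well-formed; proto3 cannot express rules such as "name must not be empty". The sketch below shows one way to add such checks in application code. The three rules are invented for illustration, `ValidatePerson` is a hypothetical helper, and the template parameter stands for the generated `Person` class from the first example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns a list of human-readable problems; an empty list means "valid".
// The rules below are example policy, not something Protobuf enforces.
template <typename PersonT>
std::vector<std::string> ValidatePerson(const PersonT& person) {
    std::vector<std::string> problems;
    if (person.name().empty()) {
        problems.push_back("name is empty");
    }
    if (person.id() <= 0) {
        problems.push_back("id must be positive");
    }
    const auto& email = person.email();
    if (email.find('@') == std::string::npos) {
        problems.push_back("email has no '@'");
    }
    return problems;
}
```

Running the same function before `SerializeToOstream` and after `ParseFromIstream` keeps the policy in one place for both writers and readers.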