serde
Topics On
The Rust Programming Language
Edition 2018
Daniel Joseph Pezely
17 April 2019
The following was presented at the Vancouver Rust meeting on 17 April 2019.
This covers a powerful library for the Rust programming language, whereby paradoxically one’s software benefits by writing less code overall.
* * *
“My deep hierarchy of data structures is too complicated for auto-conversion.”
–someone not using serde
Use ? early and often

Of course, all this applies to far more than just JSON, but JSON is easier for presentation purposes here and is likely familiar to a general programming audience.
Take time to read serde.rs entirely before jumping into the API docs at crates.io/crates/serde. You’ll find it time well invested!
(Spoilers: it’s resolved entirely at compile-time, and without run-time “reflection” mechanisms.)
Let serde give you superpowers by relying upon:
I. Decorate structs & enums with attributes
II. Write methods of auto-convert traits
III. Coalesce errors via ? operator
IV. Bonus: Deep or mixed structures? Easy!
Attributes in Rust are like decorators in Python. These are compiler directives for code-generation and related capabilities. The syntax is a hash symbol (#) followed by a clause within square brackets.
An attribute may be placed on a struct or enum declaration, on an enum variant, or on a struct field or a field within an enum variant.

If writing code handling common patterns: that’s probably the wrong approach!
If writing code to handle name or value conversions: that’s probably the wrong approach!
If checking for existence of nulls or special values: that’s probably the wrong approach!
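As a minimal sketch of where attributes can sit (on a declaration, on an enum variant, and on a struct field), the following uses only attributes built into Rust itself rather than serde’s. The names `Fuel` and `Lamp` are made up for illustration, and the variant-level `#[default]` assumes Rust 1.62 or later.

```rust
// Attribute on an enum declaration: ask the compiler to generate
// implementations of these traits.
#[derive(Debug, Default, Clone, PartialEq)]
enum Fuel {
    // Attribute on a variant: this is the one `Fuel::default()` returns.
    #[default]
    Wind,
    Whale,
}

// Attribute on a struct declaration.
#[derive(Debug, Default, Clone, PartialEq)]
struct Lamp {
    fuel: Fuel,
    // Attribute on a field: silence the unused-field lint for this field only.
    #[allow(dead_code)]
    wick_length_mm: u32,
}
```

Serde’s own attributes follow the same placement rules, only spelled `#[serde(...)]`.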
? operator

Make aggressive use of the ? operator; e.g., use Result and ErrorKind together.
Implement various methods of the From and Into traits.
The compiler reveals exactly what you need, so this becomes fairly straightforward plug-and-chug.
A common Rust idiom, not just a serde thing, is using the question-mark operator.
Populate a nested enum and its variants from a flattened set. For instance, each variant must map to exactly one enum. Then, nested enums may be resolved when decorating with a single attribute.
Ingest minimal data file structures into well-defined structures in Rust. For example, JSON without naming each structural component, where keys contain data (NOT the name of a struct).
Thus, have your idiomatic Rust cake and eat minimalist data files too!
(For those that are non-native to English, “Wanting to have your cake and eat it too” simply indicates the impression of a paradox. For those using serde, however, there is no paradox at all.)
Unpacking Minimalist JSON:
{
  "energy-preferences": {
    "2000s": ["solar", "wind"],
    "1900s": ["kerosene", "soy", "peanut", "petroleum"],
    "1800s": ["wind", "whale", "seal", "kerosene"]
  }
}
Notable:
Starting From The Top

Serde can handle various naming conventions such as snake_case, camelCase, PascalCase, kebab-case, etc.
use std::collections::HashMap;
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize, Debug)]
#[serde(rename_all = "kebab-case")]
struct EnergyPreferenceHistory {
    energy_preferences: EnergyPreferences,
}

#[derive(Serialize, Deserialize, Debug)]
struct EnergyPreferences(HashMap<Century, Vec<EnergySources>>);
See serde.rs/attributes.html and particularly, serde.rs/container-attrs.html
Avoid merging concepts in an enum. For instance, avoid the following.
enum EnergySources { // Don't mix categories like this!
    Solar,
    Wind,
    // ...
    Kerosene,
    Petroleum,
    // ...
    PeanutOil,
    SoyOil,
    // ...
    SealBlubber,
    WhaleBlubber,
    // ...
}
It would be more idiomatic Rust grouping them by category, instead.
Continuing from previous example:
enum EnergySources {
    Sustainable(Inexhaustible),
    Animal(Blubber),
    Vegetable(Crop),
    Mineral(Fossil),
}

enum Inexhaustible { Solar, Wind, /* ... */ }
enum Blubber { Seal, Whale, /* ... */ }
enum Crop { Peanut, Soy, /* ... */ }
enum Fossil { Kerosene, Petroleum, /* ... */ }
This is more idiomatic Rust, but our data file doesn’t look anything like this… Fear not!
(As an aside, focus on the Rust code, not precision of the categories above. For instance, pulp or pellets made from trees or other vegetable matter are all ignored here yet were in common usage during the late Nineteenth and early Twentieth Century within North America. Other divisions or categories might be better, such as petrochemical, oleochemical, etc. Or rendered, cultivated, extracted, etc.)
Decorate With Attributes

Attributes are a feature of the Rust language and are used extensively for fine-tuning how serde and serde_json behave.
Continuing{1} from previous example:
#[derive(Serialize, Deserialize, Debug)]
#[serde(untagged)] // <-- Unflatten from compact JSON
enum EnergySources {
    Sustainable(Inexhaustible),
    Animal(Blubber),
    Vegetable(Crop),
    Mineral(Fossil),
}
See “Untagged” section in serde.rs/enum-representations.html
For both pretty JSON and prettier Rust, use attributes to control naming of a field or variant. Then, one context gets to use a name that makes the most sense there and perhaps an entirely different name for the other context.
#[derive(Serialize, Deserialize, Debug, PartialEq, Eq, Hash)]
enum Century {
    #[serde(rename = "1800s")]
    NineteenthCentury,
    #[serde(rename = "1900s")]
    TwentiethCentury,
    #[serde(rename = "2000s")]
    TwentyfirstCentury,
}
Each has its preferred naming convention.{2} Rust code gets idiomatic mixed case naming for enum variants, and JSON uses a more terse mnemonic for readability there.
Note the additional derived traits: PartialEq, Eq, Hash. These let Century serve as a key within a hash table. Sorting, or storage within a tree structure such as BTreeMap, would additionally need PartialOrd and Ord.
Use the question mark operator, ?, early and often. This operator expands to roughly a match that attempts to unwrap a Result to its Ok variant. If the value is instead an Err indicating an error, the other arm contains a return statement that hands the error back to the caller, converted via From into the function’s own error type.
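To make that expansion concrete, here is a small sketch using only the standard library. The two functions are intended to behave identically; the function names are made up for illustration, and the hand-written version is an approximation of what the compiler generates, not its exact output.

```rust
use std::num::ParseIntError;

// With `?`: on `Err`, return early; on `Ok`, continue with the unwrapped value.
fn double_with_question_mark(text: &str) -> Result<i32, ParseIntError> {
    let n: i32 = text.trim().parse()?;
    Ok(n * 2)
}

// Roughly the same logic written out by hand.
fn double_by_hand(text: &str) -> Result<i32, ParseIntError> {
    let n: i32 = match text.trim().parse::<i32>() {
        Ok(value) => value,
        // `From::from` is the hook that lets `?` convert between error types.
        Err(err) => return Err(From::from(err)),
    };
    Ok(n * 2)
}
```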
As of Rust 1.26 (May 2018), its use is allowed within the main function as well.{3}
use std::fs;

fn main() -> Result<(), ErrorKind> {
    let json_string = fs::read_to_string("energy.json")?;
    let sources: EnergyPreferenceHistory =
        serde_json::de::from_str(&json_string)?;
    println!("{:#?}", sources);
    Ok(())
}
Note uses of the question mark ? operator above.{4}
Implementing just the above, the compiler helpfully tells you exactly which impl From methods to add.
As an example, here is an ErrorKind for use with the Result type,{5} continuing from the previous example:
#[derive(Debug)]
enum ErrorKind {
    BadJson,
    NoJson,
    NoFilePath,
    Unknown,
}
Implementing From methods for use with the ? operator{6} is aided by the compiler, because it helpfully reports which methods are missing.
As an exercise, comment out all impl From blocks and see how the compiler indicates exactly what needs to be written.
Then, it’s a matter of taste how deep you go in mapping each particular error to your own ErrorKind.
There’s lots to love about Rust!
impl From<serde_json::Error> for ErrorKind {
    fn from(err: serde_json::Error) -> ErrorKind {
        use serde_json::error::Category;
        match err.classify() {
            Category::Io => {
                println!("Serde JSON IO-error: {:?}", &err);
                ErrorKind::NoJson
            }
            Category::Syntax | Category::Data | Category::Eof => {
                println!("Serde JSON error: {:?} {:?}",
                         err.classify(), &err);
                ErrorKind::BadJson
            }
        }
    }
}
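The earlier main function also applies ? to fs::read_to_string, so the compiler will ask for a conversion from std::io::Error as well. One possible sketch follows, using only the standard library. The mapping of NotFound to NoFilePath and of everything else to Unknown is an assumption chosen for illustration, and `read_json_text` is a hypothetical helper.

```rust
use std::io;

#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum ErrorKind {
    BadJson,
    NoJson,
    NoFilePath,
    Unknown,
}

impl From<io::Error> for ErrorKind {
    fn from(err: io::Error) -> ErrorKind {
        // Assumed mapping: a missing file gets its own variant,
        // every other I/O failure is lumped into `Unknown`.
        match err.kind() {
            io::ErrorKind::NotFound => ErrorKind::NoFilePath,
            _ => ErrorKind::Unknown,
        }
    }
}

// Hypothetical helper: `?` converts `io::Error` via the impl above.
fn read_json_text(path: &str) -> Result<String, ErrorKind> {
    let text = std::fs::read_to_string(path)?;
    Ok(text)
}
```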
Another powerful feature of serde that lets you write less code is called flattening, which transparently hides one layer of nesting. A common use case is to eliminate a wrapper struct or the name of an inner hash-table.
For instance:
#[derive(Serialize, Deserialize)]
struct CatalogueEntry {
    id: u64,
    #[serde(flatten)] // <-- Field Attribute
    description: HashMap<String, String>,
}
The above Rust code would ultimately produce the following JSON representation:
{
  "id": 1234,
  "size": "bigger than a car",
  "weight": "less than an airplane"
}
Note that all fields get rendered to same level within JSON.
See serde.rs/field-attrs.html.
For writing the preceding item to a JSON file, the corresponding Rust code would be:
fn populate_catalogue() -> Result<(), ErrorKind> {
    let id = 1234;
    let mut description = HashMap::new();
    description.insert("size".to_string(),
                       "bigger than a car".to_string());
    description.insert("weight".to_string(),
                       "less than an airplane".to_string());
    // A one-element list of entries, so the file holds a JSON array.
    let catalogue = vec![CatalogueEntry { id, description }];
    fs::write("foo.json", serde_json::to_string(&catalogue)?)?;
    Ok(())
}
There’s nothing special here, because serde handles collections such as Vec out of the box. You just derive the Serialize trait on the element type.
Populating fields only when non-null makes for smaller data files, so there’s less to store, less to send over the network, less for a human to read, etc.
#[derive(Deserialize)]
struct Thing {
    pub keyword: String,
    #[serde(default="Vec::new")] // <-- constructor
    pub attributes: Vec<String>,
}
When the attributes field is absent from the input, this yields an empty Vec instead of a Vec containing an empty string, and it gets done without wrapping the value in Option.
In other words, serde simply Does The Right Thing for you.
As incentive to read serde.rs documentation, be especially certain to see the section, Borrowing data in a derived impl.
When data has already been loaded and memory allocated: let your deserialized structs track only references.