Anyway, the paper.
- Very long list of authors. I wonder if you can trace the intellectual lineage of the ideas by what the authors have worked on before.
- The paper itself was surprisingly hard to find, and is a PDF formatted in a way that makes it hard to read on a phone.
- Pages 1 and 2 (except the second paragraph) are meaningless fluff.
- The real content starts with "The Libra protocol" on page 2. This is mostly readable, although at first they seem to equivocate between "database" and "ledger", and possibly "resource" and "transaction".
They seem to be describing basically a distributed database. This database is duplicated between all network members (it calls these "validators") and is versioned by a most-recent hash, like a git repo. This hash seems to be their claim to being a "blockchain".
I guess the idea is the "validators" will be run by Facebook, Mastercard etc and everyone else will issue database queries against the validators. They say validators "take turns" getting to set the next git-style "commit". I guess this is what you call "proof of stake".
This early bit is confusing wrt how much database history is preserved. There's a bit where they suggest you can effectively create a new validator by downloading the complete "transaction history" (why? wouldn't you only need one known good snapshot?)
The main thing I'm looking for in this paper is how many of the problems of blockchain they fix— in my view, "blockchain" is the process of realizing that every single constituent idea is bad, and the closer to Bitcoin you get, the more impossible it is to do anything worthwhile.
So far they seem to be rejecting the most dangerous idea in Bitcoin— "proof of work", which is the idea with the potential to literally destroy the earth since it rewards network nodes for wasting as much energy as possible (more than competitors = win) on useless computations.
Overall it seems like Ethereum if Ethereum ever made good on their promise to ditch PoW (they've even got a little smart contract language!), which is an interesting achievement, but still worrisome because of the blockchainy assumptions that are still baked in:
1. Every database validator holds all content — a validator can't opt out of some popular but "useless" data type, like cryptokitties.
2. The full history matters, which means in 2080 you'll still be lugging around data from this year. The network cost increases with time.
Anyway, moving on from the summary. Looks like this isn't git after all, it's SVN.
It sounds like the amount of money in every account is just a publicly accessible number that anyone can query. This is weird. In Bitcoin you could incidentally calculate how much money a mailbox has but I don't think (?) you could just ask for it.
Ridiculously, they suggest that anyone being able to look at how much money is in your bank account is not a privacy issue because nobody knows mailbox #355565 or whatever is you. Because it's not like there's a global market for deanonymised personal information or anything.
So they've got account values you can query, and a general purpose programming language (Move). Users submit Move scripts to validators. It happens to be the case it's very easy to create a script that moves a number from account A to account B & generates a receipt (an "event").
They claim that the mechanics of Move somehow make it impossible to write a Move program that would violate the rules of a currency (like moving a money value without permission). Validators execute Move scripts without looking but that's always guaranteed to be "safe". Okay.
I can believe that much, though it still seems like a bad idea to offer a whole general programming language if what you wanted was a currency. If I were designing this I'd have made a more minimal language exactly apt to the operations they want.
Turing completeness is by definition an attack surface too large to police, and they gain nothing by it. The only clear reason is that they for some reason thought Ethereum was a good idea and wanted feature completeness with Ethereum. But this feature ended badly for Ethereum.
In more plain language: They want a programming language so they can make useful self-executing "contracts", like "you deliver me X thing and I give you $6". But making it too powerful creates the risk of something that LOOKS like a contract to buy a sandwich, but is a trap…
…which just takes your money and delivers nothing. Or which breaks the whole network. In Ethereum there was a disaster where a bug in a popular "wallet" program one day destroyed all the money in all the wallets (every wallet using that program's code) at once.
Part of that was because the Ethereum people had no idea how to design a language and made one where bugs were likely. But even a language that discourages bugs could have programs with malicious intent or backdoors (intentional "bugs"), crafted to look "normal".
I'm very curious what the answer to this question is. But I'm still not very far in the paper.
Notes on page 6:
- The word "Gas", an Ethereum vocabulary word, shows up here out of the blue.
- "There is no concept of a block of transactions in the ledger history." The fundamental unit is the transaction. This seems to sidestep some performance limitations in Bitcoin clones.
Page 7 they seem to clarify something that worried me: The permanent history of the system does NOT need to be kept; rather, it is something that *can* be offered as a convenience to make it more likely clients will believe validators are not cheating. I… guess that's good.
Although it looks like you can submit any Move script you want, they *are* limiting Libra initially by only including the data types needed to run the currency. That sounds like a good idea honestly.
Although it *does* mean my plan to ruin Libra from day one by uploading several trillion copies of this image from "Iria: Zeiram the Animation" into the database is thwarted— for now.
So. There's:
- Modules (data types and basic operations on them)
- Resources (items in the database; can only be moved, not copied; can only be created by modules)
- Transactions (scripts; cost Libra coins to execute; the only things that can change the database)
- Queries (free?)
- Validators (these are the database shards; you submit queries to them, and they take turns executing the submitted transaction scripts)
- Events (receipts; the event record can be queried from validators, but they're not part of the database and can't be read by transactions)
Each submitted transaction has a "bid" price of how much it's willing to pay the validator in Libra per cycle to run its script. If there's too many scripts for the validators to run, only the high bidders get run. There's also a "max I'll pay" value— this prevents infinite loops
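For my own benefit, here's how I imagine that bidding works, as a sketch. The field names and the greedy selection rule are mine, not the spec's:

```python
# Hypothetical sketch of gas bidding (names are mine, not from the spec):
# a validator runs the highest bidders first, and only admits a script if
# its full "max I'll pay" budget fits in the remaining capacity — so an
# infinite loop can never burn more than its declared maximum.

def select_transactions(pending, capacity):
    """Greedily pick the highest-bidding scripts that fit in capacity."""
    chosen = []
    used = 0
    for tx in sorted(pending, key=lambda t: t["gas_price"], reverse=True):
        if used + tx["max_gas"] > capacity:
            continue  # can't guarantee this script room to finish
        chosen.append(tx)
        used += tx["max_gas"]
    return chosen

pending = [
    {"id": "a", "gas_price": 5, "max_gas": 100},
    {"id": "b", "gas_price": 9, "max_gas": 300},
    {"id": "c", "gas_price": 2, "max_gas": 700},
]
picked = select_transactions(pending, capacity=500)
# "b" and "a" fit; "c" bids too little and is too big to squeeze in.
```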
I'm trying to figure out if there's a bottleneck here— if there are twice as many validators, it seems like the network can't handle transactions twice as fast, since globally only one validator runs at a time (?).
The basic primitives, like Libra coins themselves, are modules in the database like anything else. There's a "T0" transaction in the spec that creates the modules that let Libra coins work. Again, creating more modules beyond that is temporarily banned.
I'm not sure if it's possible to change modules, it seems like it might not be. If you can't, that raises questions of what if a module has a bug. If you CAN, it makes me wonder what happens if someone manages to change the Libra coin module (which of course is in the database)
The Move language used by the transactions has a bytecode form you submit to the server. There's GOING to be a version you can just write as a programming language, but it doesn't exist yet. The crypto rules are enforced by the bytecode verifier. This happens every time you run.
The description of Move in the basic spec seems to be designed to impress the reader very much by establishing that the spec authors have heard of all three of object-oriented programming, functional programming, and "ML", but communicates nearly nothing useful.
The important thing is modules are collections of stateless functions that operate on resources. You need modules because each resource has a module "type", and resources of that "type" can only be created— or have their fields altered— by functions published by the correct module.
Section 4.1 seems to be where things get real.
They start by explaining the validator (the database server) is not trusted, so every computation returns a result AND a quickly-checkable proof that it was performed correctly.
For non CS people: there's a whole class of problems where you can produce a "proof this is correct" along with the result and checking the proof is faster than calculating the result itself. This is a Thing.
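A toy example of the "checking is cheaper than computing" thing (nothing to do with Libra specifically, just the general idea): finding the factors of a number takes a lot of work, but checking a claimed factorization is a single multiplication.

```python
# Toy illustration (not from the paper): producing the answer is slow,
# verifying a claimed answer plus its "proof" is nearly free.

def find_factors(n):
    """Slow: trial division until a nontrivial factor turns up."""
    for d in range(2, int(n ** 0.5) + 1):
        if n % d == 0:
            return d, n // d
    return None

def check_factors(n, proof):
    """Fast: verifying the 'proof' is one multiplication."""
    d, e = proof
    return 1 < d < n and d * e == n

n = 1000003 * 1000033        # a big semiprime
proof = find_factors(n)      # slow part: up to ~a million divisions
assert check_factors(n, proof)  # fast part: one multiply
```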
(Also I suspect most computations done in Move are trivial—you need consensus on an answer more than you need the answer.)
They then spend a full page explaining what a Merkle Tree is. I'm not sure why. I think it's meant as an example of a data structure that lets you return proofs along with results.
…so. I stopped here for a bit because honestly, I cannot figure out what the entirety of section 4 from 4.1 on is trying to describe.
So like I said, they start this part by describing Merkle trees. … incorrectly. Or at least in a heterodox way.
They say a merkle tree is: "a data structure used to store maps between integers and string values". I have not seen them described this way before. Here's the wikipedia description, which is pretty easy to read:
Merkle trees are about "hashing". For non CS people, this means taking a long string and creating an "effectively unique" fixed-size short string that's like a stamp of authenticity. If you do some math on your long string, you get the short string, and it's the same one every time.
It's hard to "reverse" a short string into a long string, so if you know the short string (the hash) already, and someone sends you something they *claim* is the long string, you can do the math and be certain it's the long string you wanted.
Merkle trees are a variant of this idea where you first chop up the long string into many small bits, and make a tree of hashes of these small bits. A good reason to do this is that calculating the hash for the full long string can be very slow & sometimes Merkleing is WAY faster
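If that's hard to picture, here's a bare-bones Merkle tree in Python. This is my own sketch of the general technique, not Libra's actual structure: chop the data into chunks, hash each, pairwise-hash up to a single root, and prove any one chunk belongs by shipping only the sibling hashes on its path.

```python
# Minimal Merkle tree sketch (mine, not Libra's): build a root from
# chunks, then prove one chunk's membership with O(log n) sibling hashes
# instead of resending the whole long string.
import hashlib

def h(b):
    return hashlib.sha256(b).digest()

def _next_level(level):
    if len(level) % 2:
        level = level + [level[-1]]  # duplicate last node on odd levels
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]

def merkle_root(leaves):
    level = [h(x) for x in leaves]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]

def merkle_proof(leaves, index):
    """Sibling hashes from leaves[index] up to the root."""
    level = [h(x) for x in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        proof.append((level[index ^ 1], index % 2))  # (sibling, am-I-right-child)
        index //= 2
        level = _next_level(level)
    return proof

def verify(leaf, proof, root):
    node = h(leaf)
    for sibling, is_right in proof:
        node = h(sibling + node) if is_right else h(node + sibling)
    return node == root

chunks = [b"alpha", b"beta", b"gamma", b"delta"]
root = merkle_root(chunks)
assert verify(b"gamma", merkle_proof(chunks, 2), root)
```

The proof for one chunk out of a million is only ~20 hashes, which is the "WAY faster" part.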
I think the reason Libra's describing Merkle trees weird is they're *actually* interested in the "Merkle tree accumulator", which is a different data structure *built on* Merkle trees. They refer you to a book (um), but I *think* the PDF is here:
Okay. So now that I've had a chance to read the Crosby 2009 paper I just linked on Merkle Tree Accumulators (I will call these MTAs): This paper *rocks*, actually. It's clearly written, it's clever, and it does something obviously useful. And it describes merkle trees correctly.
The idea w/ MTAs is you have a "log", an ever-growing series of strings. You store these in the leaves of a tree. When you run out of space, you grow the merkle tree at the top, doubling your number of leaf nodes (at that moment making the left half of the tree immutable forever)
They have some math so an observer can say "what's in the log at time 8023?" and the MTA server can return the value along with a short-ish hunk of hashes that allow the observer to prove it's part of the same log and that history has never been surreptitiously altered. It's cool.
(Caveat: The math that allows you to prove the log is consistent requires you to already know the hash of at least one value that's already in the append-only log. But in Libra you always know at least the "Time 0" hash, because it's in the spec.)
Armed with the explanation of MTAs, section 4.1 and 4.2 in the Libra paper make sense: The "transaction history" of Libra is an MTA. The reason Libra can throw away old history is they can keep just the handful of hashes necessary to construct the "same log" proofs back to time 0
At this point we notice something interesting: *Libra is not a blockchain*! As already mentioned it doesn't have "blocks", & now we find there's no "chain". A blockchain is a data structure where you link data blobs end to end by hashing [old blockchain hash | new data blob hash]
Blockchains are used by systems like Bitcoin to create an "append-only ledger". Libra is also based on an append-only ledger, but they don't use a blockchain data structure to do it. They use this (vastly superior) "Merkle Tree Accumulator" data structure instead.
(And the reason I say MTAs are superior to blockchains: you can safely jump from time T to time T+2000 without having to download the full history in between, you only need the logarithmically shorter "incremental proof".)
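For contrast, here's a sketch of the plain hash-chain structure I described above (mine, simplified): each new head hashes [old head | new blob], so verifying that a head really extends an old one means replaying every entry in between, O(n) instead of the MTA's O(log n) incremental proof.

```python
# Sketch of the plain "blockchain" structure (mine, for contrast with
# MTAs): each entry commits to the previous head plus the new blob.
# Any tampering anywhere in history changes the final head — but
# checking that requires re-hashing the entire history.
import hashlib

def h(b):
    return hashlib.sha256(b).digest()

def extend(head, blob):
    return h(head + h(blob))

GENESIS = h(b"time 0")

def chain_for(blobs):
    """Replay the whole history to compute the current head."""
    head = GENESIS
    for blob in blobs:
        head = extend(head, blob)
    return head

history = [b"tx1", b"tx2", b"tx3"]
head = chain_for(history)
assert chain_for(history) == head
# Replaying any altered history produces a different head:
assert chain_for([b"tx1", b"TAMPERED", b"tx3"]) != head
```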
Anyway, I'm not sure an append-only ledger is a thing that is actually *useful*, but okay, I'm sold, if you've decided you're going to create an append-only log, MTAs sound like the best way to do it.
So now I'm at section 4.3 and I'm stuck again. Here they're talking about the database *contents*— the Libra database consists of up to 2^256 key-value stores, bucketed by account number. Here again I'm tripped up by a strange and seemingly incorrect use of the term "Merkle Tree"
In sections 4.1 and 4.2, the phrase "Merkle Tree" is used throughout to mean "Merkle Tree Accumulator", which is not the same thing. In section 4.3 suddenly they seem to be using it to mean something entirely other— they have a diagram of some tries, and label them "Merkle Trees"
I guess what's happening here is along with their database of accounts they have a sparse trie which is an index of the accounts by 256-bit address, and they label each non-leaf node with a hash to turn it into a Merkle tree. OK. That makes sense, but I don't get why they do it.
If I imagine what advantages that gets you, my mind goes to like… sharding, having different branches of the account space be handled in parallel. But any optimizations I can imagine are precluded by their decision to use the "single threaded" append-only ledger ["blockchain"🙄]
They then describe the contents of the account buckets. The key-value store has a directory structure, which means if I were doing this I'd use a Merkle tree, but somehow this is the one place Libra DOESN'T use a Merkle tree. They take time to explain why this was a bad decision.
If you look carefully on page 15, they do something extremely upsetting: They state the key-value pairs are stored in order "sorted by access path", but don't specify the sort order. This is a pet peeve of mine. You can't!! Do that!! In a world where Unicode exists!! UGH
They briefly acknowledge the attack I mentioned earlier of repeatedly uploading full copies of "Iria: Zeiram the Animation" (1994) into the database, thus forcing this global financial system to waste space. They note they're considering later on introducing an idea of "rent"…
…where every account slowly drains of money proportional to how many kilobytes it's taking up. They offer "One example of such a policy" for rent but don't… like… make it part of the spec. I guess this matters less right now since there are no user-definable "modules" yet.
Chaining multiple "things left unsaid":
- The current Libra implementation is designed only for Libracoin.
- A version is coming introducing user modules, plus a more efficient representation for account contents, plus "rent" for excessive disk usage, making Libra general-purpose.
A nice person who is working on the Move VM parts of Libra notes that though the spec doesn't define key ordering, the core implementation makes a decision. Looks like keys (filenames) in accounts are raw bytes, and it's bring-your-own-unicode-encoding.
The next section explains how "events" (receipts for a transaction) are stored. There's an MTA (which they again misleadingly call a "merkle tree") which stores these. It's not clear to me if there's one MTA for all events, or a separate event MTA per transaction.
They say they store in the transaction history ledger the "authenticator" (a term they've used throughout to refer to, I think, MTA proof blobs) for the events associated with any one transaction. I'm not totally sure how to reconcile that with the claim transactions can't—
—access events when they run their scripts. So maybe the "authenticator" is like a pointer into the event MTA/ledger, or maybe they do store the events in the transaction MTA/ledger but the transaction ledger can't be read by transactions. I don't know. But… I also don't care.
I don't actually need to know how they implemented this— what's important is if you know a transaction's ID number you can access all the events it generated, efficiently and in order, and they've adequately convinced me this is true. Fine.
The next part is section 5, where they explain their "Byzantine Fault Tolerant Consensus" protocol. This is the thing they have instead of proof-of-work or proof-of-stake, and it's what will make or break Libra as something you can use in the real world.
I'll read this tomorrow.
(A small note: A couple people have QRTed this thread with "I guess there's something to Libra after all!". I will note just because it's possible to make sense of the spec does not mean they've made something *useful*! I'll hold off conclusions on THAT until I'm done reading.)
Okay! I said I was going to continue this thread about Libra "tomorrow" and now it's nearly two weeks later but WHATEVER, I'M CONTINUING IT NOW.
Let's talk about Byzantine Fault Tolerance.
Earlier I suggested Libra uses "proof of stake". This was wrong:
If you have a distributed append-only ledger (or a "blockchain") you need some kind of consensus mechanism for determining the "true" ledger/chain. Bitcoin uses "proof of work". "Proof of work" has to die. When people claim Bitcoin causes global warming, "proof of work" is why.
The problem is there isn't a battle-tested replacement for PoW. There are *suggested* replacements, but none have been deployed in an environment comparable to Bitcoin— IE one where the stakes are high enough that if an attack exists, it's certain it will be found and exploited.
After all, we didn't (AFAIK) understand the real problems with Proof of Work (51% attacks, "it destroys the planet" etc) until Bitcoin got as big as it did. So I don't trust the crypto community to find the problems with PoW's replacements until they've been tested the same way.
For comparison, "Proof of Stake" was at one time batted around as a replacement. But no one can seem to make it even work. Ethereum— one of the chains large enough that it's worth trying to steal from— was supposed to switch a year ago, and now won't until 2021.
(I'm not sure what went wrong with Proof of Stake. When people talk to me about it, they usually say something like "in practice it winds up being really Proof of Work". The Stellar paper, linked above, phrases this the most clearly I've seen:
Anyway. Nevermind Proof of Stake.)
Libra uses something called Byzantine Fault Tolerance— originally a concept from "real" big-data databases and not blockchain at all. As I understand, in that world, the concern wasn't database nodes being malicious but just disagreeing for innocuous reasons like race conditions.
Libra specifically uses a version of an existing protocol called "HotStuff": https://arxiv.org/pdf/1803.05069.pdf …which *was* designed for blockchain applications, and its paper is specifically concerned with the idea some of the database servers in the pool are malicious.
BFT (Hotstuff), as a replacement for PoW (hashbreaking), is the most important thing in the Libra paper. It justifies Libra's existence; without it, Libra is kinda just Ethereum.
The Libra paper spends two (2) pages on it. Maybe they just expect you to read the Hotstuff paper.
At this point one notices something: Other than Move, which is *modeled on* a thing from Ethereum but is otherwise a new work, there appear to be no new concepts in Libra. Rather they've picked a basket of existing technologies (like MTAs and Hotstuff) and fused them in one work.
This is not necessarily a bad thing! Libra was framed to us as a consortium choosing a best-practices cryptocurrency. So I guess it makes sense they go— pick Ethereum's smart contract idea but write our own language, pick THIS blockchain replacement and THIS consensus method.
The Libra paper makes a pretty simple claim about why they chose Hotstuff: They tested several BFT-based systems and this was the one that performed the best. Hard to argue with that.
I have one concern still, but I'll get to that in a moment. There's a hint in this screenshot.
"Byzantine" doesn't mean "complex" here. It's about the East Roman Empire.
The original 1982 Byzantine problem is about holding a vote without a trusted vote-counting authority. You want to convince each voter they know what the outcome of the vote was, and hopefully do less work than forcing every voter to manually hand-count every vote themselves.
Hotstuff has a wrinkle: rather than *one* vote, it assumes an ever-growing log (an append-only ledger— which, in Libra, we have: the Merkle Tree Accumulator) and assumes a continuous stream of votes seeking majority agreement on "was the log X at time Y?" "was the log Y at time Z?".
The Hotstuff authors seem completely terrified of the idea that one validator will go out to lunch and the entire network will hang waiting for this one validator to respond. So they build everything around votes where they gather "enough" votes and then cut the vote off.
If you read the Hotstuff paper, they keep referring to:
- n: number of nodes currently in the network
- f: number of "faulty" nodes, or rather, the maximum allowable # of faulties
- (n - f): at least this many nonfaulty nodes are assumed to exist
- "replica": what Libra calls a "validator"
For any vote, if you get (n-f) votes (this might be more or less than a majority) then the vote carries. Hotstuff calls this a "quorum". f is a tunable parameter; you can set it higher (as a % of n) for more security, or lower for faster votes with less network traffic needed.
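So the quorum rule, as I read it (a sketch of the idea, not Hotstuff's actual code):

```python
# Quorum check sketch (mine): with n replicas and at most f assumed
# faulty, a vote carries as soon as (n - f) signed votes arrive — nobody
# waits for the stragglers.

def vote_carries(votes_received, n, f):
    return votes_received >= n - f

n, f = 100, 33                       # classic BFT sizing keeps f < n/3
assert vote_carries(67, n, f)        # quorum reached
assert not vote_carries(66, n, f)    # one short — keep waiting
```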
Weirdly, the Libra paper also leaves "f" as a tunable value— you'd think as an *implementation* they'd have to pick a particular value. Libra mentions their Hotstuff implementation is stored in Libra itself (as a Move module) so I guess maybe the idea is f can change over time.
Hotstuff works in "rounds". In each round, a leader is chosen. The leader picks a block of content (in Libra, these are user transactions— Move scripts) to add to the ledger. Three votes then proceed: a "prepare" vote, a "precommit" vote and a "commit" vote.
The "prepare" vote is the actual vote on whether the new content is accepted. Once the leader has (n-k) accept votes, they send out a digest of the (n-k) signed votes— because remember, the point isn't just to hold a vote, but to convince all participants the vote was legitimate.
Once everyone has their prepare digests (vote tallies) they hold another acceptance vote, this time to accept the results of the vote.
And then they do it again. For… some reason.
Prepare accepts transactions; precommit accepts a prepare vote; commit accepts a precommit vote.
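Here's my toy mental model of one round (definitely not the real protocol, just the chained-votes shape): each phase is a vote to accept the previous phase's certificate, and the block only lands if all three phases reach quorum.

```python
# Toy walkthrough of the three chained votes (my sketch of the shape of
# the idea, not LibraBFT): prepare -> precommit -> commit, where each
# phase needs (n - f) votes on the previous phase's certificate.

def run_round(block, replicas, n, f):
    cert = block
    for phase in ("prepare", "precommit", "commit"):
        votes = sum(1 for replica in replicas if replica(phase, cert))
        if votes < n - f:
            return None  # round fails; a new leader retries later
        cert = (phase, cert)  # quorum certificate feeds the next vote
    return cert

n, f = 4, 1
honest = lambda phase, cert: True
sleepy = lambda phase, cert: False  # "out to lunch" — never votes

# 3 honest out of 4 with f = 1: quorum of n - f = 3 is still reachable.
assert run_round("blk", [honest, honest, honest, sleepy], n, f) is not None
# 2 honest is not enough — the round stalls out:
assert run_round("blk", [honest, honest, sleepy, sleepy], n, f) is None
```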
What the Hotstuff paper assures us is that there's this earlier protocol called PBFT they're based on, and the fact PBFT held 2 votes instead of 3 was a real problem. They say the third commit prevents an explosion in messages if consensus isn't immediately met (say votes fail).
So this all makes sense. There seem to be some problems though.
First off, we're so concerned about voters going out to lunch— but what if the *leader* goes out to lunch? Libra has an answer to that one, they invent a shared clock and define a timeout period. So that's fine.
Second: How do you pick a leader? Libra's paper says they've invented this cool extension to Hotstuff where leaders are picked randomly and as late as possible, so a malicious third party can't predict the leader and DoS them the moment they go live.
This turns out not to exist.
It turns out the reason the Libra paper's description of BFT is only two pages is they defer the details to a separate paper on "LibraBFT", their Hotstuff fork. This paper turns out to be *41 pages long* (the Libra paper is 29), though most of that appears to be proofs.
Here's the section on selecting a leader from the longer "LibraBFT" paper. And it says, basically: "We will pick a method for this later"
At this point, this is no longer a spec. This document is not enough to actually create an interoperable independent implementation of Libra.
A third, similar problem. When it's a validator's turn to be "leader", it picks itself the transactions to add; it can choose however it likes, probably picking the ones with the highest "bid" (see upthread). But where do the transactions come from? How do end users submit them?
The LibraBFT paper explains transactions-to-run are submitted to "a shared mempool", and Validators pull them out using "a mempool protocol". "A" mempool? The Libra paper describes its networking in detail in sec. 6— but *not* what the mempool is or how it works, anywhere I find.
These last two issues are annoying but excusable. At this point the Rust reference implementation is the real spec, I guess, and the spec PDFs are just an explanation of how the reference implementation works.
But then it gets worse.
My problem four: when that "prepare" phase happens, what *exactly* are the validators voting on? This step is a disaster in Ethereum. In Ethereum, *every* script that runs on the shared VM has to be executed on *every* node.
I was about to complain Libra is the same, but— wait!
Libra has "authenticators". They talked about this earlier. Libra computations are supposed to end by producing proofs that the result follows from the question. So Libra validators other than the "leader" shouldn't have to run the script— just the authenticator. Right?
Wrong!!! It's exactly the same as Ethereum! EVERY validator runs EVERY Move script and the "authenticator" is only compared to make sure everyone got the same one.
This isn't surprising to me, I guess, but it's disappointing they couldn't find a way to improve on Ethereum here.
Take a moment, think about what this really means. This is the *opposite* of scaling. If the network grows from 50 nodes to 500, it can't run ten times as fast. Instead, you do ten times as much work to get the same result. 450 more computers running the same program. Every time.
So I don't like this, but at least it's *no worse* than Ethereum. But then there's my final problem, and this is the one that calls into question for me if Libra is even viable.
They talk about N. The # of nodes in the system.
N is a constant.
*The number of nodes is fixed!!!*
Hotstuff/LibraBFT—and I'll say more on this later, but this is apparently an inherent property of the "BFT" type of blockchain consensus algorithm, as opposed to something else called "FBA"—assumes there's a fixed list of validators and every validator has a full copy of that list.
Nodes can be added and removed— like most of Libra's innards, the node list is itself stored in the Libra database— but it's a nontrivial event, sort of like rebooting the network. How often are these reboots? "Less than" once a day, apparently. So maybe once a day, potentially?
Libra already told us when they first go live, all validators would be handpicked. But now we see even after they "open up", *per the protocol* the validator set can never be *actually* open— and will always be *small*. 1000 is their *hoped someday scaling goal*!
I'm gonna pause this thread for tonight now. But when I come back to it (tomorrow or next weekend, maybe) I'm going to have a lot to say about a simple question:
Is Libra "decentralized"? Can Libra or a network with Libra's protocol *ever* be "decentralized", even in principle?