-
Notifications
You must be signed in to change notification settings - Fork 32
{Deprecated. Newer at PR#20} PoET 2.0 Consensus #12
Conversation
Quick general comment, I would be careful with your usage of the word "valid". I believe it is good to distinguish between a block's validity status and block's commit status. A block's validity should only depend on deterministic computation, things like signature verification, structure verification, batch validity relative to a state hash, etc. It should not be possible for a block's validity status to change over time. A block's committed status however can and should change over time based on the state of the network and other block's that have been considered. To be more specific, I would not use "valid" in this sentence in the "Early Arriving blocks" section: |
text/0001-poet2-consensus.md
Outdated
1. Generate an ECDSA key pair. This pair is held in SGX memory and NEVER committed to disk for reuse subsequently. | ||
2. Create a table in SGX memory to map the Block Number to WaitCertificates. Also store the block number of the chain head in SGX. | ||
3. Generate a Quote containing the ECDSA public key and perform Quote verification with the IAS. | ||
4. Check if there was a previous ECDSA key registration. If there was, verify if ‘K’ blocks have been added to the chain since the key’s registration. If K blocks have not passed, wait until K blocks have been added. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check if there was a previous registration by what? by the same EPID Pseudonym maybe?
If K means the same thing in PoET2 as PoET1 (maximum registration life) I don't think we want to wait k blocks, because if the process dies, that machine can't re-register for potentially a really long time.
text/0001-poet2-consensus.md
Outdated
2. Create a table in SGX memory to map the Block Number to WaitCertificates. Also store the block number of the chain head in SGX. | ||
3. Generate a Quote containing the ECDSA public key and perform Quote verification with the IAS. | ||
4. Check if there was a previous ECDSA key registration. If there was, verify if ‘K’ blocks have been added to the chain since the key’s registration. If K blocks have not passed, wait until K blocks have been added. | ||
5. Register the ECDSA Public key and the IAS response with the ledger. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need a better name for the keys. Consider using the PoET1 terminology, e.g. PPK. The fact that it's an ECDSA key is ancillary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice RFC! I have a few wording comments and consistency nitpicks.
text/0001-poet2-consensus.md
Outdated
This document details a new mechanism for the PoET algorithm that overcomes some of the challenges with the original algorithm. | ||
The intended audience of this document is architects, developers and anyone who would like to get a better understanding of the details of the PoET 2.0 algorithm. | ||
|
||
This RFC describes a new PoET 2.0 Consensus for supporting the SGX PSE-free machines. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The phrase "SGX PSE-free machines" could be interpreted two different ways: 1) Machines without SGX and without PSE, or 2) Machines with SGX but without PSE. My first interpretation was 1, so I was confused until later in the paper.
The next section uses the term "Platform Services". Would that work here? If so, this phrase would be more clear if it was something like this:
SGX machines without Platform Services
or
machines without SGX Platform Services
Otherwise, add "Enclave (PSE)" to the end of the phrase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And a minor note: Best tech-writing practice is to use the full term "Intel® Software Guard Extensions (SGX)" here, because it's the first time the term is used. But if that makes the sentence confusing or too long, it's probably okay to use "SGX" here, because the full term is in the next section.
# Guide-level explanation | ||
[guide-level-explanation]: #guide-level-explanation | ||
|
||
In PoET 1.0, the PoET SGX enclave set a timer internally and would only release a WaitCertificate, once the timer elapsed. To be considered a candidate for consensus, a block (a.k.a Claim Block) would have to be accompanied by the corresponding WaitCertificate. This meant that peers receiving a Claim Block accompanied by the WaitCertificate, were assured that the block had not been forwarded prematurely, hence the phrase ‘Proof of Elapsed Time’. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: The term "claim block" is written inconsistently. At first, it's "Claim Block". The rest of the paper uses "claim block", but there's also two occurrences of "ClaimBlock" without the space.
Technical writers prefer lower case (claim block), because it's easier to read. Plus, fewer terms would have to change. But "Claim Block" is acceptable as long as it's consistent.
text/0001-poet2-consensus.md
Outdated
To handle clock drifts, it may become necessary to synchronize with a standard network clock. Details are out of scope for this version of the document. | ||
|
||
## Block Creation | ||
Once the algorithm is in a state to publish blocks (i.e. the C test passes), it starts the process of creating claim blocks with the aim of participating in the Consensus mechanism. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found "the C test" to be confusing. Is it the same thing as Step 6 in the previous section? ("Wait for ‘C’ blocks to be added ..."). Could you clarify here?
text/0001-poet2-consensus.md
Outdated
## Wait Certificate Creation | ||
Once a claim block is created, the PoET algorithm requests the PoET enclave to create the WaitCertificate. | ||
|
||
The contents of the WaitCertificate for PoET 2.0 are: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This information is hard to read in paragraph form. I suggest using a bulleted list. For example:
The contents of the WaitCertificate for PoET 2.0 are:
* Duration: A random 256 bit number generated by SGX
* WaitTime: The number of seconds to wait, determined by the LocalMean
.
.
.
If you don't like bullets, you could simply add a blank line between each item so that each renders as its own paragraph.
The third choice is an RST term-definition list, but I always forget the syntax for that one.
text/0001-poet2-consensus.md
Outdated
Upon receiving a claim block and a WaitCertificate, the receiving validator does the following: | ||
### Sanity checks | ||
1. Ensure the Block and WaitCertificate signatures are valid | ||
2. The Block’s ID matches the WaitCert’s blockID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another nitpick: I suggest spelling out "WaitCertificate" (here and elsewhere) instead of abbreviating it in a few places. Consistency is good. :-)
text/0001-poet2-consensus.md
Outdated
# Rationale and alternatives | ||
[alternatives]: #alternatives | ||
|
||
Poet 2.0 improves on PoET 1.0 adding support for PSE free devices. Current PoET 1.0 limits the availability of PoET consensus only to SGX PSE enabled devices. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As with my first comment above, I suggest using the term "(SGX) devices with/without Platform Services" (or something similar), instead of "PSE-free" or "PSE-enabled".
text/0001-poet2-consensus.md
Outdated
# Motivation | ||
[motivation]: #motivation | ||
|
||
PoET 1.0 offers a very unique and efficient way to achieve consensus based on secure timers set inside an Intel® Software Guard Extensions (SGX) enclave. The secure timer and the monotonic counters used in PoET 1.0 rely on _SGX Platform Services_ to access these services from the system hardware. _SGX Platform Services_, unfortunately, are not yet available on all SGX enabled machines. While plans are in place for introducing these services universally across the processor lineup, it will take some time to release the services and for them to become ubiquitous. PoET 2.0 address these challenges by making the PoET algorithm independent of the _SGX Platform Services_. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PoET 2.0 addresses
these challenges by making the PoET algorithm independent of the SGX Platform Services.
text/0001-poet2-consensus.md
Outdated
10. If the block is invalid i.e. it is an early-arriving block, compare the duration with the duration of the claim block. If the block duration is smaller, drop the claim block. Wait for the WC to catch up with the CC. If the claim block duration is less than the incoming block duration, discard the incoming block. | ||
11. Proceed to step 6 once the block becomes valid. | ||
### Fork Resolution | ||
12. If the new block claims an earlier block as a parent, check for any cached blocks claiming the new block as a parent, retrieve them. Compute the new CC and check if the chain is valid (CC + E < WC) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should not this be a recursive call until a terminal child is reached within the cache? There could be a fork comprising of multiple (>1 block) in the cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The text should be wrapped at <80 columns like the existing RFCs.
I think generally this RFC is currently light on detail given the extremely complex nature of the topic. It assumes a working understanding of current PoET, but should probably fill those knowledge gaps as well so it provides a complete picture. Otherwise, a lot of assumptions need to be made by the reader.
text/0001-poet2-consensus.md
Outdated
When the PoET 2.0 module is initialized, it performs the following actions: | ||
1. Generate an ECDSA key pair. This pair is held in SGX memory and NEVER committed to disk for reuse subsequently. | ||
2. Create a table in SGX memory to map the Block Number to WaitCertificates. Also store the block number of the chain head in SGX. | ||
3. Generate a Quote containing the ECDSA public key and perform Quote verification with the IAS. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this a regression, in that PoET 1.0 was specifically design to avoid talking to IAS frequently? In this sequence, a conversation with IAS is necessary between consensus engine restarts because the ECDSA key pair is not stored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The signup has to happen every time the process starts/system reboots based on the K-test, similar to what we have in poet 1.0.
The Wall Clock (WC) measures the time elapsed since the arrival of the ‘Sync Block’, i.e. the first block on the chain. | ||
The Chain Clock (CC) measures the cumulative wait duration of all the blocks in the chain (each fork will have its own CC). | ||
Upon receiving the Sync Block, WC and CC are initialized to 0. In practice, the WC may be calculated by recording the system time at the moment of the arrival of the Sync Block and subsequently subtracting this timestamp from the current time. | ||
To handle clock drifts, it may become necessary to synchronize with a standard network clock. Details are out of scope for this version of the document. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this RFC requires all nodes to sync time across the network to get correct behavior, this needs to be addressed here in detail, since it is both non-trivial and probably not realistic. No such assumptions are made in the current version of PoET.
text/0001-poet2-consensus.md
Outdated
ValidatorID: The ID of the current Validator | ||
Sign: The signature of the WaitCertificate computed over all the fields using the ECDSA private key | ||
|
||
Once the WaitCertificate is created, the enclave stores the Last Block # and the WaitCertificate in the table created previously. The validator will only create a single wait certificate per block. This implies that if a fork containing fewer blocks becomes active, the validator will NOT be able to publish any more blocks until the new fork has added enough blocks to catch up to the previous chain. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What impact will this have on attempting to keep a PoET network running? Many of the current PoET tests are problematic in testing situations, or situations with low node counts; this would seem to fall into that category, and presumably would increase these problems. The existing PoET tests (checks would be a better word here) can essentially be disabled as necessary for testing purposes, but this sounds more fundamental.
Signed-off-by: Ashish Kumar Mishra <[email protected]>
8403e7c
to
4fc6c81
Compare
11. Proceed to step 6 once the block becomes valid. | ||
### Fork Resolution | ||
12. If the new block claims an earlier block as a parent, check for any cached blocks claiming the new block as a parent, retrieve them. Compute the new CC and check if the chain is valid (CC + E < WC) | ||
13. If valid, compare existing chain length with the new chain length. If existing chain length is more, discard new chain. If the new chain is longer, switch to the new chain (commit the new blocks) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a question about this fork resolution process. Can someone explain why we only need to compare chain lengths (rather than check the total number of "work" done on the chain--i.e., incorporate the population estimate)?
Here's my thought process: suppose I controlled a small group of nodes. We could could fork off everything to a side chain and then wait for the "difficulty" to go down. In the steady state, due to the population adjustment, we would be adding blocks at the same rate as the main chain. We could let this go on for a while--maybe we would get lucky and accumulate blocks at a slightly (sublinear) rate faster than the main chain. Then, at some point, we could add in a bunch of new members and use these to add much more blocks than expected and try to catch up to the main chain. It's possible there's some mathematical reasoning that says this is impossible, but it's not immediate (and needs to be written up if this is becomes the spec).
Does anyone have an explanation for why this doesn't work? Am I misunderstanding something here? Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Signed-off-by: Ashish Kumar Mishra [email protected]