# Stateless Ethereum call #8

00:00 Piper: Recording. There are people here. Let's talk about some stuff. Hello everybody, glad to have you here in my Zoom.

00:17 Here is a link to the meeting agenda. The rough order of business today is to cover reGenesis. I have an update on the state network that I'd like to give, and I would like to discuss where people feel we're at on both code merkleization and the binary trie transition, and essentially propose that we start moving those things forward into the implementation and EIP phases.

01:10 Does anybody else have anything they want to toss into the mix before we jump into those items?

V: The natural third thing in the triad with code merkleization and reGenesis is gas cost changes...

Piper: Is there any specific area there? Originally our gas cost changes were based on the assumption that we need to do witness gas accounting during EVM execution, and it's not that we get to sidestep that forever, but it gets less important if we think reGenesis is where we want to go.

01:56 V: Right, but I definitely don't think we should stop working on it.

Piper: Alright, that's fair. Is it accurate to say the sentiment here is that we still should be able to reprice gas, and because we can, that should be a high priority? Sorry, not because we can; because it's hard and people are averse to it... maybe only a few people are averse, but okay.

02:31 Alright, I follow to some degree, and yes, I think that would be good. I will add that to the mix. I am inclined to push reGenesis out to the latter half of the meeting, because I think it's going to be a bigger discussion and there are going to be questions and things.

02:53 So I'm inclined to run through the other, smaller discussions first.

James: Can you post that tweet, Vitalik, that you had about the gas costs in the chat, and I can put it in the agenda for people. I know you sent it to me earlier. ( https://twitter.com/VitalikButerin/status/1275077913604354054 )

Piper: While that is kicking off, I'm going to start with what I feel is the smallest thing first and we'll jump in from there. So the order is: I'm going to give an update on state network stuff, we'll jump into the binary trie, then code merkleization, then the gas costs. We have 90 minutes allocated for this meeting, so I'm going to give us the last 40 for reGenesis, which gives us the first part of the call to get through these first things, and anything we don't use we can allocate towards an ad hoc talent show or singing contest, whatever.

04:04 Alright. For those who've been following along, right now it is looking like we can actually build the state network as a DHT on top of the Discovery v5 network. Felix has done some really fantastic work with Discovery v5. There's the v5 spec, which is what the beacon chain testnets and everything have been running on, and there's an update to that, the v5.1 spec. It is not fully specified; it's sort of split between an issue and an open pull request, so those who want to implement it would be running ahead of the curve right now. However, one of the nice things that Felix did in the Discovery v5.1 spec is include a mechanism for extensibility.
This essentially gives you a mechanism to build secondary overlay networks on top of the DHT that support whatever extra protocol functionality you need. So right now I have been working on getting a compliant client in place that runs the v5.1 Discovery spec.

05:31 After that I will be cranking out an MVP of, I'm still calling this "state network", but we probably need to come up with a better name, because "state network" is a bad descriptor at this point: my primary goal right now is hosting chain data. So somewhere in the coming short number of weeks

05:54 I think I'm going to be at the point of starting to get a prototype up. I'll allow about a two-month timeline here as an upper bound on how long I think it's going to take me, at which point my hope is that I will have a spec that I can publish for people who want to jump in and try this thing out, where you can use the DHT to fetch arbitrary chain history, including things like fetching transactions by their hash, so hosting the transaction indexes, canonical block indexes, and things like that over this

06:35 DHT-based network. The goal here is to allow clients to forget about chain history. Some of the recent discussions from Tomasz with Nethermind, as well as, I believe, TurboGeth, include an option for freeing up about 160 gigs of hard drive space by ditching all the transaction receipts, and this would provide a nice mechanism for getting the best of both worlds: freeing up all of that space while still having direct access

07:07 to all the historical transactions, blocks, etc. So this is essentially aimed at being a mechanism to allow clients to free up literally hundreds of gigs of hard drive space, trading that off for fetching these things from the network on demand and participating in that network to help keep it healthy; the goal is for that participation to be very easy and simple to do.

07:42 Anything else there... oh, and one of the core pieces I'm planning on releasing with this sort of client is the ability to install it as a standalone thing, spin it up, and have it host a JSON-RPC server. It exposes the majority, if not all, of the JSON-RPC APIs that you would probably otherwise end up needing centralized infrastructure such as Infura for, if users opt into that. Nonetheless, one of the nice things about the DHT network hosting

08:16 all of the chain history is that it can serve the vast majority of the JSON-RPC API just by being a participant. That's the one thing I am focusing on myself in terms of implementation. I'll continue to post updates on that in the discovery channel and the state network channel.

08:46 Questions? Anything? Otherwise we can jump on to discuss the binary trie. Okay. Here is the post from Guillaume on the ethresear.ch forums where he essentially gives us a baseline spec for the binary trie: https://ethresear.ch/t/binary-trie-format/7621/17 Also this: https://ethresear.ch/t/overlay-method-for-hex-bin-tree-conversion/7104 I want to zoom out a little further, though, and point out that because of the pandemic we skipped one of the things we would have been doing, which is that somewhere around two months ago we would have all gotten together in person.
09:31 That was one of the promises or commitments that was part of this whole research effort: to get together in person and work this stuff out, because we can do so much more when we're in person together. Clearly that's not an option these days, which puts us at a bit of a handicap.

09:54 Cool, Alexey also posted another link in the chat for those that want to jump in. I will get back to these things in a second. We can't meet in person, which is unfortunate; if we had, I believe that would have been the point at which we could have hashed out any remaining disagreements, gotten to a point of agreement, and probably started moving some of these things forward.

10:26 That didn't happen, clearly. So what I am inclined to do is push forward and figure that process out even though we can't get together in person. It's a lot harder to make these hard decisions remotely; making the calls as a group is, most of the time, not a fun or effective thing.

10:48 But it's what we have, so we try to make the best of it. My current understanding of the progress of research on the binary trie is that we are at a state where we could start moving toward implementation and solidifying a spec for it. So that's my essential, my

11:17 proposal, I don't know what to call it. I'm nominating that as the topic of discussion here. Does anybody here feel opposed to that in some way? That it's not ready? Thinks that we should not do it?

11:38 Cool. On the other side of that, I am curious if there is any sort of general tacit agreement that this is something that we could move forward on. It feels like something that we want no matter which direction we go with stateless; it helps out in reGenesis, and in general it helps out in the longer-term roadmap of getting Eth1 into shards and being able to support stateless execution.

V: One question... I know this was discussed on ethresear.ch and we talked about this many times as well, but we've had conversations about switching to binary simultaneously with migrating from a two-layer structure to a one-layer structure.

12:31 Is this something that we want to stick into the plan, or talk more about?

Piper: So I got less motivated on that once I realized that it can be modeled in a... what's the word here... So TurboGeth treats them as the same. Is that accurate, Alexey?

12:54 Alexey: Yeah, so this whole division into one layer or two layers is something logical, and you can basically program it in a way that it looks like exactly the same tree.

Piper: There's a question here of two-layer versus one-layer. For anybody who isn't sure what we're talking about right now: all of the contracts have their own storage trie, and most of the clients model these as separate.

13:23 So you've got the global account trie and then all the different storage tries, and that can be a little frustrating to work with. A single layer would model them in the spec as if they were all in one trie. And here I'll hand over to Alexey, who has figured out that even though the spec describes them as different tries, they don't have to be treated that way.
13:49 Alexey: Yeah, the thing to understand here is that whatever spec you're referring to, probably the yellow paper, basically gives you the definition of the different trie node types: there's a leaf node, there's a branch node, and so on. But if you add another node type, which we call an account node, then essentially what you're doing is connecting every storage trie into the one overall trie, and the problem is basically solved, because the account node has its own merkleization rule. That's exactly what we do.

14:21 So we don't even think about them as two separate things anymore. They're all just the same thing.

Vitalik: Do we even need to think about accounts as a separate node structure then? Why not fold it into the same system?

Piper: Well, I think this isn't necessarily an argument against that. It's an argument that it may not be strictly necessary.

14:48 I am still loosely in favor of it. On our team we use plus one / minus one / plus zero / minus zero: the zeroes mean "I'm in favor, but if somebody strongly disagrees they trump me", and the plus ones and minus ones mean "I'm willing to argue with you." At this point I'm sort of plus zero on it.

Alexey: It's just not entirely necessary. As long as the migration to the binary trie does not make the situation worse than it is now, it's fine, as long as we do not actually increase the special treatment of the storage trie. At the moment that specialness does seem to be inherent, but it's actually not, and you can work around it in the code.

15:45 Piper: One of the other things that made the merging of the tries less compelling to me was that it still didn't seem to do anything about actually balancing the tree as a whole. That's one of the things about contract storage that has continued to bug me: it's so imbalanced, and it makes splitting the state up into equal pieces difficult.

V: Right, and you could argue the imbalance is good, because in the normal case when you access a contract, that contract accesses a lot of its own storage, so it cuts the witness size down. Which is awkward, right?

James: I'm kind of curious to get your thoughts on this from the client maintainer perspective, about what this would be like for you to support or not.

16:51 Tomasz: I was looking at TurboGeth, where storage is being treated as separate or as the same thing... and really it matters when you calculate the state root. It just defines the optimizations: where it's faster, where it's slower. If you keep updating the chain of nodes from the root after every single update, then you have a bit of a longer chain to keep updating; if you do it separately, those chains are shorter. But I was thinking recently about some optimization and merging it together again. I have stored it both merged and split; one version works smoother in one case and the other in another, but there's not much of a difference. It changes a lot of the complexity in the clients, but for the general way you think about the tree it doesn't change that much.

James: And then the transition in general, the tree conversion: your thoughts on supporting the tree conversion in general, zooming out?
Tomasz: It's up to the research side to tell us that it really brings the results we expect it to bring, and Alexey and Vitalik seem to be very convinced that this brings a lot of improvements. So, you know, if research says there will be improvements, we'll do it.

18:36 I'm really not here to have an opinion. I think it's a very risky operation. I don't like the ideas where we stop the world to do it, and I don't like the ideas where we slowly transition. If it enables us to bring some other significant improvements, or it's critical for witnesses and so on, it's fine. But if it's a three-times improvement, then I think it's not enough for the complexity of the operation; if it's a ten-times improvement, then we can start talking business.

19:19 Piper: And I think this is where I am, and why I am inclined to move this forward: the research seems compelling that this does the thing that we want it to do, and we understand it reasonably well. At this point, getting implementations in place is where we learn the next piece of information about where the complexity is and how this affects performance and clients.

19:50 It is a big change, and thus, at some point, if we're going to do it, we have to actually start doing it to find out where the bodies are buried, right?

20:06 Alexey: So this is why I posted the link to Guillaume's article from May on ethresear.ch, where he is suggesting the overlay method for the hexary-to-binary conversion. I understand this is specific to go-ethereum, and in any case this particular conversion will be very specific to the implementation, so I don't worry about this at all: it will be easy for us to do; for go-ethereum it might be less trivial, but still achievable. And getting to

20:41 the point that Tomasz raised: the latest in that thread, if you look at it, is essentially that Martin suggested to not completely freeze the chain, but to disallow transactions for a certain amount of time, say 45 minutes to an hour, so that there would be no transactions in the network; that would be enough time for the nodes to convert, and then you turn them on again. That's the latest that I have seen there.

21:13 I'm not for or against this. I think in order to decide what the best transition mechanism is, we need to look at who is going to participate in the transition. There is an option that some implementations do not want to actually do the transition

21:16 but simply rely on some sort of temporary centralized consensus about these binary trie roots. This process doesn't happen all the time; it will happen once, and once you are over the threshold and you've done it... if it takes two hours, then for two hours you have to rely

21:52 on some kind of extra mechanism for publishing roots. I think we could get through it. So the most important thing is whether the different implementations can actually implement the binary tries. And beyond that, unless we want to utilize the witnesses in all sorts of ways, and here of course

22:19 I agree with Tomasz, it might not be a necessary change. It only makes sense to do this if we do want to utilize witnesses.
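[Editor's note: for readers following the hexary-to-binary conversion discussion, the core structural change, whether done as an overlay re-interpretation or a full rewrite, is that each hex nibble of a key path expands into four binary branch decisions. A minimal sketch, illustrative only; the leaf encoding, prefix handling, and merkleization rule are exactly the details still under discussion:]

```python
def nibbles_to_bits(key: bytes) -> list[int]:
    """Expand a hexary key path (one branch choice per 4-bit nibble)
    into a binary key path (one branch choice per bit).

    Illustrative only: the real spec also has to settle leaf encoding,
    extension/prefix handling, and the hashing rule at each level.
    """
    bits = []
    for byte in key:
        for nibble in (byte >> 4, byte & 0x0F):   # two nibbles per byte
            for shift in (3, 2, 1, 0):            # four bits per nibble
                bits.append((nibble >> shift) & 1)
    return bits


# Example: a 32-byte hashed key becomes 256 binary choices, so a single
# 16-ary branch node corresponds to a depth-4 subtree of binary nodes.
assert len(nibbles_to_bits(bytes(32))) == 256
```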
22:38 Piper: I'm happy for us to get into these issues, but I think a lot of these things are going to be an ongoing discussion as we get further into this and as each client figures out, "this is how I'm dealing with it,

22:57 this is how I have to deal with it." Some of those things will become clear as we do implementations and have working versions of this. There's going to be some version, in the global consensus, of how the transition works, but there are also going to be individual client versions of how each client chooses to transition within those bounds. And, like Alexey said, some may actually do the transition and some may

23:33 opt to essentially just rejoin the network after the transition, without actually doing the conversion themselves. Chances are that's what Trinity will do, just because we're going to be behind the curve on this, and it probably makes more sense for us to re-sync the state after the transition than to do

23:58 the transition itself. But clearly everybody can't do that, so there are going to be other clients that get leaned on. I want to make sure we support them, but this is just going to be an ongoing discussion.

Tomasz: Do we already have calculations on the size of the state in the traditional form of storing the full trie on disk, and how it changes after we transition from the Patricia tree?

Alexey: This is something that Vitalik published more than a year ago. It is obvious that if you simply apply the same methodology that is applied now, where each node in the Patricia trie is one database record, then it's definitely not practical, because your tree is going to be something like four times bigger and the database size will be astronomical.

24:56 What he wrote is that you could still store the data as if it were a hexary trie, but merkleize it as if it were a binary trie. You are only changing the way it is interpreted, so you still have the same size of records, but you re-interpret each hexary node as a small subtree of the binary Merkle trie rather than as one node.

Tomasz: Yeah, we discussed it... but did we actually execute the calculation on a specific state? Did someone run it?

Piper: Is the question how big the storage is if you store it naively? If you're just storing leaves, and everything intermediate essentially lives in a cache, then it should be roughly the same size, because the leaves themselves don't change.

Alexey: I mean, I haven't run it myself, because we don't need to, since our storage layout doesn't change at all, but maybe somebody could do it in go-ethereum. Essentially what I'm saying is that the way to progress this is not

26:30 more research on this; it now has to go into the implementation stage, and of course to do that you have to look at the formats. And actually Vitalik just posted that link, and I don't think we need to talk about it, because I think everybody agrees.

26:50 Piper: Yeah. So, for the sake of time, I'd like to move us on to the next topic very soon. What I will say is that I'm always disinclined to ask people to vote on things, because voting sucks. I am inclined to get this moving forward, and to do that, what I am asking for is an EIP author,
so somebody who is interested in, you know... I think Guillaume has first dibs if he wants it, but if he doesn't, then I'm looking for somebody who is interested in being the primary author of the EIP and maintaining it through the process, as we bring this to all core devs, get implementers involved, and handle all of the various feedback.

27:51 Yes, yes, yeah, sorry, I'm probably bastardizing the pronunciation of that name.

From Alex (axic) in chat: There is already an EIP draft by Guillaume.

Piper: Oh excellent, thank you, Alex. So my plan is to essentially surface this on all core devs, either this Friday or in the following meeting, just to put it in front of everybody in the proper channels, and to focus on getting implementations started

28:24 for this. So if anybody is strongly opposed to this or thinks that this is a bad idea, now is the time to speak up, and we can start figuring that out.

28:40 Paul: I just want to add a point that wasn't discussed yet, just to make a record and make people aware of it. There's discussion originally on the ethresear.ch thread. Effectively, Guillaume rewrote appendix D of the yellow paper into a binary version of it and defined a new Merkle rule. Vitalik responded and said that the RLP bytes might not make sense, and proposed a different merkleization rule, and then I responded with a different rule, so now there are three rules to discuss. I think there might be some interaction there with the witnesses, so we might get some savings on the witnesses if we choose the merkleization rule carefully. So we're looking forward, and there is work being done on this. It looks like we're going to change from "RLP then hash" to some other rule that concatenates and hashes the two children plus some prefix; that's an open question, but I hope it'll be wrapped up soon. Piper, I'm going to talk to Guillaume about the EIP thing. I think we need to specify the merkleization rule, and that's really all that has to be specified in the EIP; then everyone can look at that and say "yes, we can implement this efficiently." So I think the EIP should be the new merkleization rule for the binary trie, and a lot of other things are going to be inherited from the existing spec.

31:03 Piper: Awesome. I will admit that I haven't done as thorough a deep dive into all of those discussions, so that'll be part of my homework before the next call: to fully dig in there and make sure that I have a good grasp of all the things that are in flight. My ask for an EIP author is very much centered around knowing what capacity I have, but I do plan on being involved in that process and trying, as best I can, to give whatever feedback and support I can.

James: And I'm happy to help anybody who wants some guidance on the EIP process and wants help talking through it and things like that.

31:48 Piper: Alright. We have got ten minutes to stay on track here, so I'm going to move us on to code merkleization. Sandra, Horacio, anybody from the team, would any of you like to give a brief update on where you are at with that?

32:09 Horacio: Yeah, I can talk about that. Well, first of all, our team had some big restructuring recently, so we are still getting back to the work, but our code is mostly finished. We still need hopefully one to two weeks, at which point we have a plan to probably come back with real numbers.
32:44 That's the status. For now, our feeling is that the approach that will win will probably be the simple one; our approach of doing full static analysis is probably too complicated for the gain that we get.

Piper: We'll be keeping an eye out for something coming out of your team on that, which will essentially be preliminary numbers.

33:23 Feel free to make that a forum post if you think that's the appropriate place; if you are at a point where you feel confident enough in it, a draft EIP is fine as well. I'm in roughly the same boat with code merkleization as I am with the binary trie-ification, which is that I think we know enough about it.

33:45 I think we understand the gains that we get from it, and that they are useful, and that we're at the point where we can start looking at implementation and moving things forward.

34:00 Alexey: I wanted to add one more thing: one of the people I work with has also been looking into this a little bit. We are still working on this static analysis thing, which is more general than just trying to prove that there are no invalid jumps. It's not ready yet, but we are getting close to reliably building the CFG, the control flow graph, on essentially

34:33 the smart contract, just given the bytecode. It looks like, as long as the contract was compiled by Solidity, and for most of the others as well, it is possible to construct the control flow graph, and then it will probably be possible to prove that there are no invalid jumps. Another thing

34:53 I did is that I tried to look at the history of Ethereum mainnet and find all the transactions (this is actually just a curiosity, which could be turned into an optimization) that actually made use of the jumpdest analysis. What I mean is: usually, if you do a JUMP and it jumps to an opcode which is not a JUMPDEST, you fail; but if it jumps to a JUMPDEST byte, you have to check the code bitmap to make sure it is not push data. It turns out there were only three transactions in the entire history where that code bitmap check mattered, and I made a nice optimization out of this. That's a curious find. And in fact, all three of those transactions were essentially attempts to create a contract, and I'm pretty sure they were specifically trying to test that feature.

35:58 Piper: That is so cool and interesting, that there are three such transactions out of... what are we at, 200 million, close to a billion now? Right, yeah, that's cool. But in general terms, though: who's opposed to this? Is there anybody who has an objection, who doesn't think that we should do code merkleization?

36:31 I'm listening, because otherwise I'm going to be looking at doing the same thing: as soon as we have a spec that we can bring in front of all core devs, presenting this there and working on championing it into whatever subsequent hard fork we can have it ready for.

36:53 I think code merkleization is definitely an easier transition for us to do, in a certain sense, but it still does need to touch large parts of the tree in order to make the change. So this is something that I'll be looking to move forward through all core devs and get implementations begun.
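[Editor's note: as background for the jumpdest analysis Alexey describes, clients typically precompute a bitmap marking which code bytes are push data, so a JUMP target is only valid if it is a 0x5b byte that is an actual opcode. A rough sketch of that idea, simplified and not any particular client's implementation:]

```python
JUMPDEST = 0x5B
PUSH1, PUSH32 = 0x60, 0x7F

def code_data_bitmap(code: bytes) -> list[bool]:
    """Mark which byte offsets are push data (True) rather than opcodes."""
    is_data = [False] * len(code)
    pc = 0
    while pc < len(code):
        op = code[pc]
        pc += 1
        if PUSH1 <= op <= PUSH32:
            n = op - PUSH1 + 1                  # number of immediate data bytes
            for i in range(pc, min(pc + n, len(code))):
                is_data[i] = True
            pc += n
    return is_data

def is_valid_jumpdest(code: bytes, target: int, is_data: list[bool]) -> bool:
    # A jump target must be a JUMPDEST byte that is an opcode, not push data.
    return target < len(code) and code[target] == JUMPDEST and not is_data[target]
```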
37:26 V: One addendum to the code merkleization thing is that it could potentially enable increasing or removing the contract size limit, which is something that I know there are people who want. Though that would only be possible to do safely if we have code merkleization and we have gas rules on top of it that charge per chunk of code accessed. So it's not necessary, but it's also the kind of thing that would be easy to add, and worth keeping on the radar.

38:02 Piper: It's a decent lead-in to changing gas costs, which is the last item that we wanted to discuss before losing the rest of the meeting to reGenesis. "Losing" is the wrong word. How about "winning progressively"? I don't know.

38:26 Okay, so as far as gas cost restructuring goes, where I remember leaving off was being a bit exasperated about it and feeling like none of the paths forward felt good. Some of that might have been just being sort of frozen in the "none of these let us do it very cleanly without potentially breaking some things" state,

38:54 and going down a bit of a side tangent on how we can preserve meta-transactions while removing gas observability. I'm certainly not well prepared to talk about what the best way forward here is. Vitalik, did you have a direction that you felt was compelling here?

V: I'm kind of wondering whether it makes sense to talk about this now or as part of the reGenesis thing, because one of the things about reGenesis is that it just kind of bundles a bunch of issues together, and gas ends up being one of them.

39:37 And I do think there is a strong argument for pushing gas cost changes earlier if we can. One of those arguments is that there's this sword hanging over our heads as far as the Ethereum network is concerned: if an attacker wanted to, they could generate a block that takes 30 seconds to process. And this is something that we could alleviate at the drop of a hat if we just, say, quadrupled the gas costs for storage and for accessing external contracts at the earliest hard fork. That would be a change that's exclusively a change to some config variables.

Piper: So did I hear that correctly, that one of the things you're talking about here is just a flat change to some of the prices that we know are mis-priced?

40:39 V: Right.

40:43 Piper: Yeah. In a security context, with a compelling reason like the one you described, that is something that I think can be accepted, sold, and moved through. It's probably more of an all core devs discussion, just because it doesn't exactly fall into the purpose of this call.

41:09 But in terms of the broader repricing, fixing the root of the problem, do you want to get into that now? I'll put that on the agenda going forward as well.

V: Yeah, so I guess from the perspective of this call the goal is bounded witnesses, right? So we agree on that. And bounding, even in a reGenesis context where you have kind of partially stateless nodes, also the fully stateless witness size.

41:53 Piper: Yes, because the reGenesis witness stuff has a weird intersection with, but doesn't fully overlap, the stateless witness stuff,
mostly because they come bundled with the transaction and are kind of prepaid for, as opposed to pay-as-you-go, which is what block witnesses end up being.

42:16 V: Right, and so basically there's just a choice of increasing the costs versus, again, some kind of witness pay-per-byte plan, right? So this is pay-per-byte versus gas cost increases, and other stuff around figuring out who pays and how sub-calls work, all of those things. I don't know if I have anything new to say. I think the only thing that I want to point out is just that reGenesis in its current form ends up bundling a lot of these decisions together, when I don't necessarily think they need to be bundled.

43:02 Piper: Let's get into reGenesis, and you can maybe point out where these things come up, because that isn't immediately clear to me. Alexey, I am inclined to give you first right of refusal to present reGenesis. I've been enjoying presenting it; it's not my idea, but I've been enjoying trying to figure out the

43:34 concise presentation of it. So would you like to do that?

Alexey: I think it will be very useful for me to hear how other people present it, and I can fill in if you need something, yes.

Piper: Yeah, feel free to interrupt me at any point here, okay? reGenesis. If somebody wants to drop a link into chat while I do this, for anybody who wants to dig in on the side: https://ledgerwatch.github.io/regenesis_plan.html Also on ethresear.ch: https://ethresear.ch/t/regenesis-resetting-ethereum-to-reduce-the-burden-of-large-blockchain-and-state/7582

44:03 So the general gist here is that, at some point... we'll talk about this as a one-off event, but I think what we're really talking about is this as a thing that happens on a regular cadence. At the reGenesis point, the state root stays the same.

44:24 So from the block just before reGenesis to the block right after it, the state root might change in the normal course of things, but we don't actually clear it out. What we do say is that everything that's in the state now falls into one of two categories, either active or inactive.

44:45 And at the reGenesis point everything moves into inactive, and the active state is empty. There is still a state root that represents all of the state, combined between active and inactive, but as far as the client is concerned the active state is empty, and also as far as the client is concerned, they know nothing about the inactive state.

45:13 They don't necessarily hold it. They don't need it for consensus-critical stuff. So: transactions, whenever they are sent, must include a proof for any inactive state they touch. Any active state that the transaction touches is fine and doesn't have to be included, but if the transaction touches anything that is not in the active state, it must include a proof for it.

45:44 This proof is presented up front, at the beginning, and everything that's in the proof is moved into the active state. At the moment that a transaction touches something that is not in the active state and not covered by its proof, the transaction reverts. All of the state that was made active stays active, but all the state changes that the transaction made during execution are reverted.

46:10 Oh, man. I'm losing the thread here.

46:17 So at a high level, the gist here is that you sweep all of the state out of the active set, and clients are no longer directly responsible for it.
It moves the responsibility to the edge, to the network, which we'll get into here in a second, and the active state slowly starts filling back up.

46:43 Rai: Just a point on one of your final points. You're saying that even if the transaction reverts because it touches inactive state that there was no proof for, everything in the proof it did provide becomes active state?

Piper: Yes. So you could send a transaction that does nothing, and just fill the transaction data up with a proof, and all of the inactive state that's in that proof gets elevated to active state, regardless of whether or not the transaction executes successfully.

47:16 V: Like, from an implementation perspective, what you would give would be just like an epoch number, and then move to the next reGenesis epoch, and then...

Piper: I think my answer is I have no idea at this point. I haven't gotten close to implementation details. Alexey is closer to that than I am.

Alexey: I am, yeah. So I would say the implementation would be that the entire state, active plus inactive, would be somewhere on disk or something, and the idea is that the active state should be small enough to be in RAM, or at least in some very low-latency storage. So once reGenesis happens, you essentially just ditch all the things that were around; you just forget about them and you start again.

48:19 But yeah, that's an implementation detail.

V: I guess for me I care more about how things are merkled than how they are disked... :)

Alexey: Oh yeah, that is actually something that I would completely separate.

Piper: Some of the other details here are that the proof, I believe, can be against any state root from the reGenesis point up until now, whatever "now" is. So you can provide proofs against any intermediary state root. The proofs do not have to be fully valid against the head state, clearly, because they're against old state, and essentially everything that's in the proof that is not already in the active state gets elevated. I think it's a relatively graceful process.

Alexey: Actually, Vitalik gave me an idea when we were discussing it on ethresear.ch a few days ago. I haven't published the update to the plan yet, but I think it will be a fairly interesting update, and it would be an even softer transition, which would probably eliminate a lot of the complexities that I came up with, like all this pre-consensus and stuff like that. So what was essentially suggested is that

49:43 the transactions will pre-pay for the witness; essentially they pre-pay for the space. So the transaction sender would not exactly calculate the witness, but they would estimate what size the witness needs to be, and as long as they give enough... so basically they just pre-pay for the witness space, and that also means that they don't actually need to transmit the entire witness. Then the transaction goes to the miners, and since the miner does have the entire state, there would be a deterministic procedure for how to generate the witness for this particular transaction in this particular block. So the miner will have to generate the optimal witness, and it will do that as long as there is prepaid space in the slot,
and if the witness ends up being larger than the prepaid slot, then it will only generate the part that actually fits in the slot, and that part will be elevated; but if the entire witness does fit in the slot, then we know what happens. So the nice property of this is that we sacrifice the statelessness of the miner, which we were prepared to sacrifice anyway in stateless Ethereum, but what we gain is efficiency and simplicity on other levels. First of all, the transactions do not need to include the witnesses; they just need to estimate them. And then you get the nice feature that you can even overpay if you want to, if you can't be bothered to generate the witness but you know roughly what the ballpark is going to be: just prepay and go, and if it didn't work, you do it again and repeat the same transaction

51:29 with the witness, and eventually, if you're prepared to do it three or four times, it will happen. So yeah, that's an interesting modification, I would say.

Piper: I think this falls into a similar category. There are a couple of different models for how transactions could theoretically fail in relation to witnesses in the reGenesis model. One of them is that the transaction just doesn't include a proof for the state that it touches, and in that model the transaction reverts.

52:08 The other is that there's some sort of dynamic state access going on. And when we're talking about this, I want to be really clear that we care about these edge cases, but they are also theoretically rare, probably quite rare, but we still have to talk about them. So these probably aren't huge issues that affect everybody, but they are issues we have to consider.

52:34 Alexey: Is it dynamic state access? I can touch on this if you want.

Piper: Yeah, go for it.

Alexey: So there's a big difference between the DSA, the dynamic state access problem, in stateless Ethereum and in reGenesis. In stateless Ethereum we don't assume there is any state at the receiver; therefore, each time you send the transaction, if we were providing witnesses with the transaction, you basically have to get it right

53:09 on the first shot, and if you don't, then when you repeat your transaction it has to start from scratch again, so you didn't gain any advantage by sending the transaction twice. In reGenesis you do get an advantage. If you think of the situation where there is an honest sender of a transaction and an attacker who tries to manipulate the DSA in order to trap the sender in some kind of weird game where the user never wins and never gets to execute their state transition, in this case the attacker always loses, because eventually the user will send enough transactions to elevate everything possible into the active state, and then it gets harder and harder for the attacker to manipulate the state in front of them.

53:57 Piper: Another way of saying this is that in the stateless Ethereum model, while the attacker still has to pay for sending the transactions, there is still the same amount of missing state in each transaction, because we assume the miner has no state and must operate off of the witness.

54:17 In the reGenesis model, every time the attacker moves stuff around, they, and also the person trying to send the transaction, are filling up the gaps in the state, and there are fewer and fewer gaps left. It always ends at some point, because eventually they touch all of the contract's state, all of the contract's state is active, and then the honest sender doesn't even need a witness at all.
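[Editor's note: to pin down the core mechanism being described, here is a minimal sketch of the access rule as discussed above. All names are hypothetical, and details such as the proof format, who assembles the witness, and gas accounting are still open questions:]

```python
class MissingStateAccess(Exception):
    """Raised when execution touches state that is neither active nor proven."""

def apply_transaction(active_state: dict, tx) -> None:
    # 1. Everything proven by the transaction's witness is elevated into the
    #    active state up front (verification of the proof against an older
    #    state root is omitted here), whether or not execution succeeds.
    for key, value in tx.witness.items():
        active_state.setdefault(key, value)

    # 2. Execute. Touching a key that is not active reverts the transaction's
    #    writes, but the elevations from step 1 are kept.
    writes = {}
    try:
        for key, new_value in tx.accesses():      # hypothetical access iterator
            if key not in active_state:
                raise MissingStateAccess(key)
            if new_value is not None:
                writes[key] = new_value
    except MissingStateAccess:
        return                                    # elevations kept, writes discarded

    active_state.update(writes)                   # success: apply the writes
```

In this sketch the witness contents stay elevated even when execution reverts, which is what makes the repeated-send strategy against a dynamic-state-access attacker converge, as described above.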
54:47 Sam: Quick question... my Solidity isn't that great... so when you append to an array, or touch a piece of state that has never been touched before, do you need to include a witness?

Piper: I do not believe that you would. Once all of the contract's state is in the active state, then there's nothing more to prove...

Sam: So essentially you don't need to include proofs for branches that haven't been written to...

Alexey: Yes, and that includes the branches that you provided, or that somebody else provided before you. So if something is really actively used, then you might not even need any witnesses for it, because people are already using that stuff.

55:36 And that also explains why, theoretically, the attacker eventually loses, as long as it happens between two reGenesis events, because of course when the next reGenesis happens, everything is flushed again and you're back to square one.

Piper: I want to point out two really nice benefits that come with reGenesis that I, at least, have zeroed in on a lot. One is that it essentially implements state rent in a sideways manner. Because when everything gets flushed, the state that the miner needs effectively drops to zero, in a certain sense, and then everybody has to pay again, not to write their state back into the tree, but to have it elevated.

56:26 And so it's not a perfect state rent mechanism, but it does implement an economic bound on the state size, if this is done on a regular basis, because it essentially means that while we won't wipe your state out of the tree, you will still have to pay for it every once in a while in order to get it elevated back into the active state.

Danny: Weren't we just talking about a model where miners keep all the state and provide the witnesses?

56:57 Piper: Alexey and I just don't necessarily agree there, and I think even that is too strong a statement; it's more that there isn't a firm spec for this. I am inclined to push for this in a manner in which miners truly get to forget about the inactive state, because this is the second really nice point that I like: by saying miners are not responsible for the inactive state, what you also do is shift responsibility for storing that state onto the edges of the network. The transaction senders are getting value out of this network, and by shifting the responsibility of managing the inactive state to transaction senders

57:48 we take responsibility off of miners, we take complexity out of the software that they have to run, and we say "it's the rest of y'all's job, and there are a lot more of you, too" to manage this stuff.

Alexey: After having this conversation with Vitalik on ethresear.ch, I was going to rewrite the plan, and I haven't done it yet, to make it into different stages of the roll-out. In the first stage, we roll it out in a way where miners still need to maintain the inactive state, and that would basically be the easier first transition step. Then, after that has settled, we can go further and say: now we're going to introduce another thing where the miners lose the responsibility of filling in these witnesses, and instead
58:35 we will put that responsibility onto the sender. So in the first stage, essentially, the senders will start to pay for it but not necessarily provide it, and in stage two they also have to provide it. That's a gradual increase in responsibility for the senders, and a reduction of responsibility for, you know, other people. So I will rewrite the plan, and it will remove a lot of complexity in other places.

59:03 Piper: I'll also point out that EIP-2718, which is typed transaction envelopes, gives us flexibility in these areas. It would allow us to introduce both models: one in which somebody who wants to send a transaction that asks for somebody else to fill in the state for them can do that, and it may or may not get accepted depending on whether the miners out there choose to have the state or not; and the other model of providing the proof up front. So it doesn't necessarily have to be either-or.

59:41 V: Yeah, I definitely like the staged approach.

Piper: Another thing that I'll point out that is really nice about this is that the inactive state is static. So at the point where we do the reGenesis event, we can snapshot the inactive state, and it becomes a lot easier to distribute via traditional mechanisms like BitTorrent or something like that. Because there are so many people at the edge who want the state, it's a lot easier to sync the inactive state than it is to sync the steadily moving target of whatever the current state is.

1:00:21 Sina (in chat): How long should the nodes keep the inactive state for after reGenesis? Wondering in case a cross-boundary reorg happens.

Alexey: Just one interesting comment I saw in the chat, which I haven't thought about before, so thank you for that. It was from Sina; the question was what happens when there is a reorg across the reGenesis boundary. I actually haven't thought about that. It's a very interesting question,

1:00:36 and I am not going to answer it right now, because I need to think about it.

1:00:46 V: So, one of the things: I'm kind of pushing for caution on the step that involves taking away the need for the miner to store everything. This was the edge case I tried to mention earlier, and I probably didn't explain it very well. The edge case I'm concerned with is basically this: imagine you have a transaction which does a whole bunch of computation, and when that computation is finished, it selects a random address, where that address looks like it's somewhere in the middle of the inactive state. The challenge there is that if you're a client that just has the active subset of the state tree, then you know that the thing is poking into inactive state, because you know it's not part of the active tree; but unless we either change the tree structure so that we keep track of activity and inactivity at every node level, or we add a second tree, there's no way to know, or to prove to someone else, whether that access is touching something that's active or inactive.
Alexey: I mean, that structure could be constructed pretty easily, and the reason I say that is because we have already constructed a similar structure for the intermediate hashes. It's not doing exactly that, but essentially what it does is have keys which are prefixes, from the root of the state trie down into the depth, and the values are basically the hashes of the intermediate trie nodes. You could easily add another kind of flag or something to say whether a prefix is actually unfolded or folded.

1:02:49 The cost of that: currently we're only storing the even-length prefixes, because we don't want to deal with nibbles and stuff like that, and it's about five or six gigabytes in our representation. If you start storing the uneven-length prefixes as well, it would be more than that, but if you remember that we only need to do this for a very short part, like the top of the tree, then it's probably going to be two to three gigabytes at most

1:03:24 for this kind of structure. So I don't worry about this too much, because we've done similar things before.

V: Right. It's not that bad, but it does get thick with complexity.

Piper: Yeah, understanding the implementation details and the implementation complexity is going to be a major part of this, because just assuming that we can easily differentiate between active and inactive state, and things like that, is probably a bad assumption to make, and we need to make sure that at the implementation level it's straightforward to do.

1:04:02 V: So I think I can articulate one of my driving concerns, which is that making a design that allows miners to not store the inactive state is significantly harder, and I do not want that to be a dependency of the gas cost increases that allow us to have sane witness sizes.

1:04:33 Alexey: Yeah, I agree with that. That's why I do like the proposal, which came out of our conversation, to make it a multi-stage transition.

Piper: I didn't quite follow that. Can you restate it?

V: So reGenesis kind of bundles together the designs that allow for reducing the state that people need to store, and all of those things, together with witness gas cost changes. A design that requires the miners to store the full state is easy, but a design that allows miners to not store it is hard, and it adds real roadmap risk if we make the hard thing a dependency of doing the gas cost changes.

Alexey: Basically, the benefits that we will get from reGenesis, even at the first stage, will be significant, even if we still require the miners to store the full state. The benefits will still be significant, and then you can go even further later on, once the benefits have already been demonstrated.

1:06:12 Danny: Do you see this as a stepping stone to full statelessness, or do you see this intermediate solution as the end goal?

Alexey: I think it could be the way to full statelessness. I don't know exactly how yet.

Piper: I see it as intermediate, and full statelessness is still something that we want and need to get to.

Danny: Right. I think a lot of this conversation should be contextualized by how an eth1 shard can fit into eth2 and whether semi-statelessness can work there. It changes the base requirements for validators, and it prevents
adding state execution to other shards. So if you have semi-statelessness on all shards, it kind of breaks some of the fundamental designs of random sampling and puts a larger burden on all the validators, and it changes, let's say, the middle game, where you just use shards as data and you kind of go all-in on rollups or something. This design could certainly affect that path, and so over time I think we need to have that conversation.

1:07:25 V: So I mean, the specific thing we would need for all of the desired Eth2 properties is just to have bounded witness sizes, both for fully stateless nodes and for semi-stateless nodes. That's purely at the level of gas, which is something that I think is okay.

1:07:53 Alexey: I was going to say two things. Mathematically, if you look at it, one of the parameters of reGenesis is the frequency of the resets, right? Obviously, in the limit where the interval between resets goes to zero, meaning reGenesis after every block, that's where you get a fully stateless client. And that's how you would try it: if you want to go down that path, what you start doing is gradually increasing the frequency of the resets, if people are comfortable with that. And that also means that, if you can bear the increased network bandwidth, which you can gauge very cautiously as you increase the frequency and see how the bandwidth changes, then you can say: okay, what if we try to get the gas limit to 100 million? Is it still going to work, at the cost of increased network activity? Why not?

1:09:03 And the second thing I was going to say is that the day I wrote the first reGenesis plan was the day the gas limit was raised to 12 million, and I think there was a Twitter discussion about it. My first reaction was to try to go through the rhetoric and analyze it, but then I pushed it aside and said: okay, is there a quicker path to the solution of this problem than doing stateless Ethereum? And this is how reGenesis came about. Essentially it's a quicker,

1:09:40 more gradual stateless Ethereum.

Danny: Right, one question: how does this affect max block size at the point of reGenesis? Is it bounded in different ways than if it were fully stateless?

Alexey: So if we think about how we used to think about stateless Ethereum: we have these potentially huge witnesses, which we could lump into the block size for simplicity, and then the block sizes would be big, like one or two megabytes and things like that. That was the main criticism. If we do reGenesis, yeah, they will be big in the beginning, but they will trail off. We don't have data for it now, but we will prepare that data

1:10:31 soon.

Piper: So I think the idea is that the size would still be accounted for in the transaction gas costs, because you would have the bytes for the witnesses, maybe up front if that's how we're doing it, to account for the data that they're including. So you do end up with these much bigger blocks because of the witness data,
yeah, and so I think that we still do see a network that has big blocks right after reGenesis, and, like Alexey said, it's going to tail off. My understanding is that you're planning on doing the analysis where you can essentially just rewind back however many blocks you want, pretend that you did reGenesis, and then calculate how big those witnesses would have been.

Alexey: We can do analysis similar to what we used to do for the block witness sizes.

1:11:35 Another thing is that I now realize that this kind of first step, where we do not require transaction senders to produce the witnesses, is actually quite nice in terms of introducing the change, because all you have to do is essentially say that the transaction gas now has to pay not only for the execution but also for the witness. It doesn't require any UX change; you only need to tell people that after this hard fork you have to give a bit more gas, and you can estimate it by trying to produce the witness and seeing if it works. It's functionally the same mechanism as how we calculate the transaction's intrinsic gas cost from the amount of data in the transaction.

Danny: Could you reGenesis sub-sections of the tree? So, like, divide it into quadrants and reGenesis them over time, so that you don't have this incredible drop-off of active state all at once. And then another thing: would it make sense, if you're not going to change the price of things (and I don't necessarily agree that you shouldn't, I'm just tossing it out there),

1:13:09 to instead lower the gas limit at reGenesis and then have it linearly climb back up until the next reGenesis?

1:13:20 V: That would just cause, like, 700 gwei gas price spikes.

1:13:25 Piper: Yeah, and I think that, in terms of slicing the state up and doing it as a rolling thing, that is something that could and should be looked into. My initial resistance there is just implementation complexity, but I think it's very much worth looking at.

Alexey: This was a question that I was answering at some point: why don't we do this reGenesis more gradually rather than as a cliff-edge thing? And the answer is that if you do things like rolling, gradual sweeps and so on, it could buy you some smoothness in terms of the curve, but it introduces complexity, and not just implementation complexity: you have to describe to everybody, including the people who are making transactions, what the rule is for deciding what is currently in the active state.

1:14:35 And if you try to describe a rule like "we split it into four parts depending on the first nibble of the hashed address, and then we take a least-recently-used algorithm over the latest 256 blocks", by this time you have basically lost everybody. The reGenesis cliff is probably the simplest description that you can get, and I think even that is already going to be too complicated for most people,

1:15:09 to explain what is actually happening with the state.

1:15:16 Piper: Some numbers that came up, just for a baseline: after a million blocks, sixty percent of the state was still cold.

Alexey: Yeah, but actually I have now found a flaw in this analysis, which I have to redo. What I did not do, which is a big flaw, is that I didn't include the things that were read by transactions;
Piper: So it's going to be less than that sixty percent, but that's still a very big number in terms of... yeah, space savings, right. V: So I just think that the concern about burst witness size is very valid, especially in the context of, let's say, an attacker pushing blocks right after reGenesis that have huge witness sizes. So my interpretation of this is that this is yet another reason why we need the gas cost changes to bound fully stateless witnesses, in addition to the reGenesis stuff happening. But like, as I mentioned, 1:16:24 there are already about five reasons to do this, so it's something that I would want to build consensus around happening very soon. Alexey: So another interesting thing is what came up when we were discussing the downside of stateless ethereum in terms of these big blocks coming through: usually when you attach a lot of data to the block, it comes in a burst every 14 seconds or so. In bitcoin, for example, it becomes a burst every, let's say, 10 minutes; in ethereum it's every 1:16:59 14 seconds, and there's this rush to push all this data through within one second, and then for 13 seconds nothing is happening, and then another burst, and another burst. So if you think about fundamentally how to change that, how do you smooth out this burst, given that you still have another 13 seconds available? Rather than trying to smooth the burst itself, the answer is this: if you think about transactions, even if they are large, let's say they include large witnesses, usually 1:17:34 when you send a transaction in ethereum you don't expect it to be confirmed right now, like in two seconds. You normally say, oh, it might happen in one minute, it might happen in two minutes, depending on the circumstances; usually it's not super urgent, sometimes it is, but most of the time it isn't. But once the transaction gets into the block, it basically becomes super urgent to get it through the network! Why? Because we basically copy the same information from the transaction into the block instead of just referring to it. So the optimization that seems to solve this, and I think a lot of systems have already done it, is essentially that you propagate transactions in the network as before, ahead of the block, and during the block transmission you simply refer to them. That means you don't have such a rush to push the same data around the network again, so it smooths out all the traffic simply due to the law of large numbers. It's like in queueing theory: if you just have a large number of people sending transactions at random times, it will be smooth enough, rather than pushing everything every 14 seconds in big bursts.
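A rough sketch of that optimization, in the spirit of compact-block-style relay: the block announcement carries transaction hashes, and a node only fetches the few transactions it has not already seen via gossip. The message shape and helper names here are hypothetical, not the actual devp2p wire protocol.

```python
# Sketch: announce a block by referring to its transactions (and, under
# reGenesis, their witnesses) by hash, and only fetch the ones a peer has not
# already received via gossip. All types and helper names are hypothetical.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class CompactBlock:
    header: bytes
    tx_hashes: List[bytes]   # references instead of full transaction bodies

def assemble_block(compact: CompactBlock, mempool: Dict[bytes, bytes], peer) -> List[bytes]:
    """Rebuild the full block body, fetching only the transactions we missed."""
    missing = [h for h in compact.tx_hashes if h not in mempool]
    if missing:
        # only the few transactions that did not reach us via gossip
        for h, tx in zip(missing, peer.request_transactions(missing)):
            mempool[h] = tx
    return [mempool[h] for h in compact.tx_hashes]
```

Since the transactions have already trickled out over the preceding minutes, the block announcement itself stays small and most of the 14-second burst disappears.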
1:18:48 Piper: So we are four minutes from our time here. I appreciate the discussions going on; I think we should move some of this into the discord channels and research topics, and try, over the next month, to start narrowing in on what we mean when we say reGenesis. That is probably going to be a couple of rounds of "alright, now we're going to re-describe it again with all of the options that we understand are on the table", focusing on understanding our options and then starting to narrow in on what it is exactly. 1:19:29 Doing that is also under the presumption that we want to do reGenesis, which I am currently leaning more and more in favor of, and it would be interesting to see where everybody else ends up. By the way, my statement there, that I want to do reGenesis, does not preclude wanting to do stateless ethereum. What I see reGenesis as is a place where we can gain something valuable for the network along the way to stateless ethereum. 1:20:00 We get some benefits there, and it's an easier goal post, and I still have every intention of continuing to move forward towards stateless ethereum as well; I see these as parallel tracks that we can run, and it's likely that we can deliver. Danny: Can you just say why it's an easier goal post? Is it primarily because we only get blocks that are big sometimes? Piper: It's because we still don't have a solid plan for how to bound witness sizes with gas mechanics when witnesses are at the block level and they are pay-as-you-go as you execute, and that is still, 1:20:42 in my understanding and opinion, a hard and unsolved problem that we don't have a nice answer for. Alexey: Actually, that's a good question, Danny, because if you apply Vitalik's modification, which I have applied or will apply to reGenesis, to stateless ethereum as well, we might, I need to think about it a bit more, but we might actually arrive at a model where we don't need to do a repricing of the opcodes. So I need to think about this; yeah, it's a good question. 1:21:20 Piper: Did that answer your question of why I think reGenesis is an easier goal than stateless? Danny: Just let me clarify... only at the point of reGenesis can you have blocks that are the same size as full stateless ethereum, correct? Piper: So it's easier for us to put bounds on those blocks because the witnesses are paid for upfront, so we can charge whatever we choose per byte of witness data, and if we want to put strict upper bounds on block sizes, we can do that 1:21:53 via pricing transaction-level witnesses in a specific way. There are no backwards incompatibility problems there and we have full control over it, so that's why I see it as easier: because we have control over it and we don't run into the backwards compatibility question. V: There might still be backwards incompatibility issues around who pays for the gas in sub calls, and the ability for a parent call to make sub calls might break, right? 1:22:23 I don't see that, but... Alexey: Yeah, the semantics change because of the potential failure due to the insufficient witness: when there's an insufficient-witness exception in a call, then the entire transaction, including all the sub calls and everything, reverts. There's no chance to appeal. 1:22:57 Piper: Yeah, okay, I think maybe I see what you're getting at, which is that it still presents the problem of not being able to make safe sub calls that you know aren't going to blow up the whole transaction, regardless of the gas you pay for them. I'm going to just toss out that I think we solve that indirectly via something like EIP-2718-based transaction types that have different gas payers, but we're going to have to leave that for another call.
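A toy illustration of the semantic difference Alexey is pointing at, with invented exception names and call structure (nothing here is specified anywhere): an ordinary sub-call failure is reported back to the parent, which can handle it, while an insufficient-witness exception aborts the whole transaction.

```python
# Toy illustration of the semantics described above: a normal failure in a
# sub call is reported back to the parent, which can handle it and continue,
# but an insufficient-witness exception reverts the entire transaction.
# Exception names and call structure are invented for this example.
class OutOfGas(Exception): pass
class InsufficientWitness(Exception): pass

def sub_call(frame):
    """Returns (success, return_data); the parent decides what to do on failure."""
    try:
        return True, frame.execute()
    except OutOfGas:
        return False, b""   # the parent call can observe this and continue
    # InsufficientWitness is deliberately NOT caught here: it propagates up
    # and reverts the whole transaction, sub calls and all.

def run_transaction(frames):
    try:
        return [sub_call(f) for f in frames]
    except InsufficientWitness:
        return None          # entire transaction reverts, no chance to appeal
```

That asymmetry is the backwards compatibility concern: a contract can no longer assume that a sub call it has paid gas for will at worst fail locally.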
We are at time; thank you everybody for your time 1:23:32 today. I'll probably chase some of you down with respect to those EIPs and whether or not we can make a preliminary presentation of any of these things on the all core devs call this week. 1:23:48 I'm excited to move some stuff forward, to deliver some things, and to get into the delivery phase. I'm also excited to figure out this reGenesis stuff. James: Planting a seed: discord has voice channels that people can pop into to discuss some of this stuff; that could be a good option. Nice. 1:24:08 Piper: I might do something like office hours and see if people care about that. I will have recordings of this call available for anybody who wants them. I do not publish them publicly, but you may reach out to me directly to get a link. Everybody have a great rest of whatever part of your day it is.