# Stateless Ethereum Call #7 transcript
### June 16 2020
I call this meeting to order! Hello everybody. Hi Vitalik. So my thought was, and I literally had this thought a moment ago, that it'd be valuable to do a very brief recap of basically: what is the critical path to getting us where we need to go?
I'm probably gonna do a mediocre job of this, so please chime in if I skip over something or get something wrong. The idea here is a quick reminder of what the critical path is for what we have to do.
So, um, working backwards from the end is the way that works best in my mind. The thing at the end is essentially that we have witnesses flowing around the network and we need to actually pay for them and make them mandatory. So that is the repricing of EVM operations to account for the size of witnesses.
That means we need a spec for what witnesses look like, and then we need all of the things that reduce the size of witnesses. That means at that point we've done the binary tree and code merkleization; those are the two areas where we know we can gain the largest wins and ideally get things down to a manageable size. To get to the binary tree, we need to get away from `getNodeData`-based syncing, so the idea there is a new sync protocol that allows us to sync in a manner that works under a binary tree.
That might be it. Did I miss something? Code merkleization, the binary tree, the binary tree means we need a new sync, witness generation in the clients, and then mandatory witnesses, which means we have to actually have bounds on witness sizes, which is the witness gas accounting.
Right, I was just about to say: agree on gas prices, yep.
Okay, um, so scope of this. I've been focused in the last few weeks on the path to witness gas pricing. Sina, and I'm gonna fail at identifying exactly who else, but there are people working specifically on code merkleization, and Sina has volunteered to give a recap of what's been going on there.
I would like to talk in this call about the approach to sync for binary trees, and I have the idea that maybe we can leverage Snap, or some variation of Snap, to get there, because that's showing a lot of success and major improvements in sync. Which leaves the actual binary treeification of the state tree and contract trees as maybe the biggest piece here that doesn't have active work going on, at least not that I'm directly aware of.
Yeah, you're not aware of it, but I'm actually working on that.
I love good surprises. So it appears that we've got some version of people directly working on each of these things, and my thought is I'd like to get a group update so that we're all familiar with where we're at on each of those topics, and with any blockers that are standing in our way, so we can start cranking out some of our first EIPs that would be scheduled for the next hard fork, for anything that can be done now.
All right, anybody got any comments on that before we move forward into that stuff?
Is there a location where you have any documents on all your up-to-date thinking on each of those?
No, they're spread out, but I am realizing that it's time to bring it all back together again into one place, so that somebody who wanted to catch up on what's happened to date and how we got to where we are can.
So, I am game to take that on, maybe leaning on Griffin a little bit for help aggregating it all. But I think, yes, it would be a very good time to get all of that written down, maybe in the actual specs repo, because now we've got this more complete picture of how we're going to tackle each of these things.
Good question. I've been trying to do recaps of the things in my realm, but yes, there's a good opportunity here to write it all back down and bring it all together into one place. We end up talking about these things in all these different directions, and we might be talking about the same thing or we might not be, because we've had all these different ideas for how to approach stuff. So yeah, getting an objective definition of "this is what we plan to do", so that we can talk about that, would be valuable.
Alright, on the docket today, not necessarily in this order: I am going to cover the witness gas pricing, or the path to being able to price witnesses. That's meta transactions, ungas, and then actually having the freedom to reprice witnesses. Sina is going to talk to us about code merkleization. I'm going to try to lead a discussion on sync: Snap versus maybe ditching Merry-go-round (MGR) for the time being, if Snap looks like it's viable under binary trees. Alexey wants to talk about the Turbo-Geth APIs that should be useful for researchers working in this department, and Paul is going to give us an update on the witness spec. Did I miss anybody?
I'd just like to say hi to everyone. So one thing that I want this group to take into account is actual mainnet current development and the All Core Devs calls. Like, what is happening, what precompiles are we including or not including, what changes are we doing, and how does that play into the actual goals for the 1x initiative?
Yeah. I mean, there's lots of discussion, but I get the feeling this group... I don't want it to be just focused on research and future stuff. It needs to tie back into what's happening right now. Just a thought that I'd like to put out there with this group.
So is that a focus on the fact that we are around the corner from the Berlin hard fork? We've got some EIPs going in there, and things that you see that have a domino, yeah, trickle effect on what we're working on.
Yes, or could have. There are exactly these discussions: should subroutines be restricted or not restricted, with only a little clique of people who are involved or have opinions.
And then there are these discussions on what the BLS precompiles should look like, whether we should use the WASM low-level bindings, and it would be interesting to have the 1x research group have some opinions there. I mean, I have a feeling the same people who are involved in 1x should also have opinions and thoughts about current maintenance, but right now it's closer to just separate groups.
I think there is a reason why they're separate groups at the moment, and something that I've tried to point out before is that the core dev calls are very constrained in terms of time and what you can discuss.
I mean in terms of the calls. So, you know, I think a lot of the technical discussion should and will happen elsewhere. It cannot all be concentrated in this whatever two hours every two weeks. So I think it's natural, but
I mean, the actual discussion about what goes in or not would just benefit from more opinions; it would be good if the informed people who are in this 1x group participated in that.
So are you suggesting we add this to the agenda today, to just have a discussion of what's going on in there?
Could be, but generally I just wanted to tell people: please give your opinions and take part in mainnet 1.0 also, because you guys are, you know, intelligent, you have opinions. That's it, I shouldn't have to go into it more than that. I just wanted to say that.
Thanks Martin, it's a good reminder. I mean, I basically bailed on All Core Devs a while back, for various reasons that I've talked about here and there, but I don't consider that the most responsible choice to have made, and I know that it would be better to have more of us involved there.
So I don't want to eat up a lot of time on this call for that, because I think we could go down the rabbit hole pretty far, but that isn't in any way saying that I don't think you have a point, and I'm game to engage on this topic.
Alright. I'm gonna think about that and let it ruminate while we move on to the next stuff. So I'm going to jump straight in and talk about witness gas pricing, or the road to witness gas pricing. I don't know if there's a better order for us to do all of this in, but I'm happy to get us started and then we can move on to the next things.
We've got five major items here. Snap I see as being a discussion. Code merkleization, there's maybe some discussion there too. Witnesses is probably a shorter one. I don't want to peg any of these, so maybe we can time-box each of them to 15 or 20 minutes.
If anybody notices that we've been on a topic for more than 15 or 20 minutes, will you please speak up and make note of that? Alright, without further ado, um,
So we want to reprice things. It's one of the last things that we get to do but we need to be able to do it when the time comes.
That is problematic because, historically, when we reprice opcodes and make things more expensive, we can end up breaking contracts. One viable route to repricing is for us to just eat the backwards-incompatible change one more time, deal with whatever specific fallout we end up causing on chain, and reprice things as they stand without actually addressing any of the underlying problems with repricing and backwards-incompatible gas changes. I think that's still on the table and it's still totally an option. But I think we actually have a path where we could fix some of these underlying things and get to a point where we have a lot more freedom to reprice opcodes.
And the path: I haven't given this as a talk or really spoken it all the way through, so forgive me if this isn't the most cohesive explanation. Ungas is a proposed approach to making gas unobservable in the EVM. If we can do that, it makes repricing opcodes less contentious. It doesn't mean that we can't still end up breaking things, but the type of breakage is not as bad, and in general it gives us the freedom to reprice opcodes because gas is no longer inspectable. The problem with ungas is that it is itself a major breaking change, and the use case that we specifically identified that it breaks is meta transactions.
So, if we break meta transactions with ungas, we probably need to replace them with a new mechanism. "Sponsored transactions" is what we've been referring to it as. I'm on a computer that's not logged in to anything, so if somebody wants to dig up the EIP link and drop it into chat, that would be great.
This was written up by Micah and it's been an ongoing discussion. Sponsored transactions are essentially a transaction format that has another signature for the gas payer. So there's a proposal right now for introducing this new mechanism as essentially a better way to do meta transactions.
In order to do sponsored transactions, we needed a new transaction format, and we looked at some different ideas. The idea was: let's just go ahead and introduce an envelope around the transaction itself. So there's another WIP EIP that is essentially a dependency of the sponsored transaction EIP, which introduces an envelope around the transaction itself with a type field that indicates what the transaction type is.
There we go, that's EIP-2718, thank you. So the idea is that we introduce 2718. It doesn't actually introduce any new functionality; it just introduces the new format, which has an envelope. And then we introduce 2711, which adds a new transaction type that has this second signature on it.
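For intuition, the EIP-2718 envelope really is just a one-byte type prepended to an opaque payload; here is a minimal sketch of that idea (the helper names and the dummy payload are ours, not from the EIP):

```python
# Illustrative sketch of the EIP-2718 typed-transaction envelope:
# a typed transaction is `TransactionType || TransactionPayload`, where
# the type is a single byte in [0x00, 0x7f] and the payload is opaque to
# the envelope (its interpretation is defined per-type, e.g. by EIP-2711).

def encode_typed_transaction(tx_type: int, payload: bytes) -> bytes:
    if not 0 <= tx_type <= 0x7F:
        raise ValueError("transaction type must be a single byte below 0x80")
    return bytes([tx_type]) + payload

def decode_typed_transaction(raw: bytes) -> tuple[int, bytes]:
    # Legacy (untyped) transactions are RLP lists, whose first byte is
    # >= 0xc0, so a leading byte <= 0x7f unambiguously signals a typed tx.
    if raw and raw[0] <= 0x7F:
        return raw[0], raw[1:]
    raise ValueError("not a typed transaction (legacy RLP payload)")

# The payload below is a dummy placeholder, not a real signed transaction.
raw = encode_typed_transaction(2, b"\x01\x02\x03")
assert raw == b"\x02\x01\x02\x03"
assert decode_typed_transaction(raw) == (2, b"\x01\x02\x03")
```

This is why the envelope is backwards compatible: existing nodes can tell a typed transaction apart from a legacy one by looking at a single byte.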
After we do that and it's been in place for a little while, which really is probably just a hard fork or two, hopefully just one, we can actually do ungas in the next hard fork, once again assuming that we can get it ready and in place and all of those assumptions hold. And then once we have ungas, we can do gas repricing. We just go crazy and reprice everything. It's gonna be great. But it opens the door to us being able to reprice opcodes. So that's the line of thinking, and why meta transactions and these sponsored transactions are leaking their way into this roadmap.
It is a bit of scope creep. And we do have a more quick-and-dirty option available to us if at some point we decide that we don't want to take on that scope creep; it doesn't mean these things can't happen. We've got a fast-and-loose approach we can take, which is to just reprice things and do some analysis to figure out: are we breaking anything, and if so, how can we address it?
Or we can do this more systemic approach that addresses some underlying, I don't know, coupling in the EVM, and makes the EVM in theory better. I think it's a good approach and it's not actually too complicated. So, that's my take. Anybody have thoughts on any of that?
You seem to have dropped the oil option as well.
I... yes, for me oil is still on the table.
Absolutely. I guess in the options that I'm thinking about, these are the options; there are more options on the table, and I'm sure paths that we haven't thought of. Oil/karma is still a very likely viable option. I'm not convinced that it's the best option, but it's still definitely there.
Yeah, so we would still do some work on it. I mean, it does require some work, but I'm still determined to try to convince you, and not just you, that it's a good option. I will say that
you've already won me over
So I guess maybe what I'm thinking is that oil could absolutely just be part of this pipeline, because I am not arguing that it doesn't make sense to have different mechanisms for measuring different things. Trying to lump everything into gas, I do agree, is kind of like collapsing a multidimensional measurement into a single-dimensional unit, so I'm not opposed to the oil/karma approach. The part where I'm not convinced is... I don't think it solves the fundamental thing: I think we still have to make gas unobservable. That's my take.
So basically what I would say, yes, is that we could... I mean, I don't want to essentially erase this path, so I think we do have to go down both. The problem is that if we choose one path now and leave the other one as a backup but stop working on it, then when the backup is required there's still going to be a huge amount of work to actually demonstrate it's viable. So I suggest... I mean, I know there are people who want to go the sponsored-transaction way, and that's fine, and there are people like me who want to go the way of oil, and as I said that requires some work, probably a lot of work, and we will see where we get, because there are pros and cons to both paths, and I don't want us to decide right now what is going to be the ultimate solution.
Can they be the same path? Because, now that I'm thinking about them, there isn't anything intrinsically mutually exclusive about them.
Yeah, they could be the same path, although with the sponsored transactions you basically pick up a couple of things on the way, which is these two EIPs, and I would say that technically they might be simple, but operationally they might be tricky. And rarely are those things actually simple, right?
Um, yeah, so if we include oil in this, then we're essentially taking on a four-step thing, which is essentially four hard forks, right, because every one of them is a hard fork. Although maybe, I'm guessing, oil could be combined into the same fork. Okay, something for us to look at, because I agree that that is a lot of things to knock down in a row.
I also wanted to comment on the first thing that you proposed. We had this chat with V as well, you saw it on ethresear.ch. Basically, the assumption that you used for the first path is that there aren't really many contracts that would be broken if you reprice gas, because it's been known for a while that we should not rely on these things. This might well be true, but I started to think about whether we could actually quantify it somehow. I don't have a solution yet, but I have some ideas about how we can use some of the methods that we are trying to use for oil to actually quantify whether there are any contracts which do break when you reprice.
I'm sure there are some ways to do that.
On that topic: has anyone come up with any legitimate category of contracts that uses child calls with limited gas, other than meta transactions? Because I feel like we've tried for a long time, and meta transactions seem to be the only example that has been talked about.
I mean, I don't know, but it would be interesting to try to discover them on mainnet. That's what I'm talking about: if we discover something which breaks when you start changing the gas, we then analyze those cases, and if we comb through mainnet and realize, okay, these are the only things we found and they are the meta-transaction things, then we can say: okay, fine, once we've solved meta transactions, nobody should be offended if we start changing the gas.
I mean, intuitively speaking, if there are meta transactions then you could imagine meta-meta transactions and so on, right? They probably don't exist in practice now, but that might be one thing that we exclude for the future by making a bespoke thing for first-level meta transactions now.
By the way, I see in the comments someone asked about the Gas Station Network; the Gas Station Network is a use case of meta transactions.
Right, so there is a line of quantitative research to be done here, which maybe looks something like running an EVM trace on block after block and looking for calls that don't actually allocate the full gas. I'm not sure what else we have.
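As a sketch of what that trace scan might look like, assuming Geth-style `structLogs` such as those returned by `debug_traceTransaction` (the flagging threshold below is our own heuristic, not something from the call):

```python
# Walk a list of Geth struct-logger steps and flag CALL-family opcodes
# whose explicit gas argument is well below the gas available at that
# step, i.e. candidate "guarded sub-calls". After EIP-150, even an
# unguarded call forwards at most all-but-one-64th of available gas,
# so we compare against that rather than against the raw gas value.

CALL_OPS = {"CALL", "CALLCODE", "DELEGATECALL", "STATICCALL"}

def guarded_calls(struct_logs):
    """Return (pc, op, requested_gas, available_gas) for gas-limited calls."""
    flagged = []
    for log in struct_logs:
        if log["op"] not in CALL_OPS:
            continue
        # For CALL-family opcodes, the gas argument is the top stack item
        # (Geth lists the stack bottom-first, so the top is the last entry).
        requested = int(log["stack"][-1], 16)
        available = log["gas"]
        unguarded = available - available // 64  # what "pass everything" forwards
        if requested < unguarded // 2:           # heuristic threshold, ours
            flagged.append((log["pc"], log["op"], requested, available))
    return flagged

# Tiny synthetic trace: one guarded call, one effectively-unguarded call.
logs = [
    {"pc": 10, "op": "CALL", "gas": 100_000, "stack": ["0x5208"]},     # 21000 gas
    {"pc": 20, "op": "CALL", "gas": 100_000, "stack": [hex(98_437)]},  # ~all gas
    {"pc": 30, "op": "ADD",  "gas": 90_000,  "stack": []},
]
assert guarded_calls(logs) == [(10, "CALL", 21_000, 100_000)]
```

A real scan would have to run this over every transaction in a block range and then cluster the flagged call sites by contract to find the categories being discussed here.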
We have a more holistic approach, which we are going to try to pursue, in the context of... I mean, there are actually a lot of contexts.
What we're trying to do now is with Suhabe, but unfortunately he's not on the call. We're starting with trying to have a much more robust algorithm for building full control flow graphs, I think in the generic sense, because what we realized is that the tools that already exist for control flow graphs are basically flawed: they don't support loops and things like that. But there is actually a way to do it much more robustly for the overwhelming majority of contracts, and then on top of that you can apply other things.
This is what we are trying to do for oil. We want to use this as the foundation to provide the algorithm for meta transactions to use safe oil limits, but we can also use it to try to prove that certain contracts do not break, for example, if you start changing a gas price.
So what I would like to do is say: if you prove that the majority of the contracts do not break whatever you do with the gas prices, but you find the ones that do break, or that you are not sure about, then you can look at them specifically and say, okay, what are these things? Then at least you can cover those quantitatively, and when somebody starts arguing with you about some broken contract, you can say: which ones are you talking about? Add them to this list, because these are the ones we are going to break.
So the argument very quickly descends to specifics, rather than being philosophical or political.
And this is something that, I forget who you said, but a member of your team is working on?
We're working with Suhabe on this, yeah. It's only the very beginning of the work, but we're going to be updating you as we go.
Got it. So this isn't just theoretical; there is somebody who is actually working on doing this quantitative research.
Yes, yes, we're trying to code this up as well, but this is very initial.
Okay. All right, this is probably an area that I will continue to focus on, and I'll continue to try to provide holistic new write-ups on it. Thanks to Micah for making the EIPs that this is all based on. And I agree, Alexey, that we should continue looking at these other mechanisms, because you're right: oil/karma could go along with this, or it could be its own solution. And we still have the quickest-and-dirtiest solution in our pocket if we absolutely need it, of just repricing things, and this quantitative research would at least tell us: even if we reprice things, this is what we're breaking, or we're not breaking anything important. So, anybody else want to toss anything in before we move on to whatever comes next? We've got code merkleization, Sina, if you want to go into that, then Alexey on the Turbo-Geth APIs and Paul on the witness spec.
Yeah, I wanted to bring something up about sponsored transactions. I'm still not 100% sure how this actually addresses the breaking changes that would be introduced with ungas or oil or anything, because the thing that's really breaking is the ability to have these guarded sub-calls. That would be passing some amount of gas to a call and catching a reversion, and sponsored transactions don't address that at all. They only address setting the caller to the actual sender of the transaction rather than the gas payer, right?
Yeah, so I think the idea is that meta transactions are just the only use case of guarded sub-calls that we've managed to find so far.
So yeah, then I just don't see why this is... I think it's a great EIP, I love it, but I don't see why it's part of Eth 1x, because if we don't care about guarded sub-calls, then we can just do exactly what we're doing right now: verify signatures on chain.
So the thing is that you need...
Maybe, I don't know, maybe I'm not thinking this all the way through, but the general idea is that, yes, you're correct that sponsored transactions do not intrinsically give you guarded sub-calls, but they provide at least the baseline tool needed to re-implement meta transactions using in-protocol tools.
But `ecrecover` isn't being broken, so why can we not just continue using `ecrecover` and authenticating transactions like meta transactions actually already do?
So the point is basically that meta transactions stop being secure if you have ungas, because someone can make a meta transaction that consumes all of the gas and then prevents the includer from getting paid, and the includer would have no way to stop it.
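This griefing problem can be shown with a toy gas-accounting model. To be clear, this is our own simplification and not EVM-accurate: `relay` stands in for a relayer contract, and the point is only that without the ability to cap a sub-call's gas, a malicious inner call starves the relayer's fee-collection step.

```python
# Toy model of the meta-transaction griefing problem (not EVM-accurate).
# A relayer executes a user's inner call; the fee-collection step at the
# end needs some gas left over, or the relayer does the work for free.

FEE_COLLECTION_COST = 5_000  # made-up cost of the relayer's payout logic

def relay(total_gas, inner_call, gas_cap=None):
    """Return (relayer_got_paid, gas_remaining) after running inner_call."""
    budget = total_gas if gas_cap is None else min(gas_cap, total_gas)
    used = min(inner_call(budget), budget)   # a call can't burn past its budget
    remaining = total_gas - used
    paid = remaining >= FEE_COLLECTION_COST  # enough left to collect the fee?
    return paid, remaining - (FEE_COLLECTION_COST if paid else 0)

malicious = lambda budget: budget            # burns everything it is given

# With a guarded sub-call (explicit gas cap), the relayer still gets paid ...
assert relay(100_000, malicious, gas_cap=90_000)[0] is True
# ... without one, which is roughly the situation once guarded sub-calls
# break, the malicious call exhausts everything and the relayer eats the cost.
assert relay(100_000, malicious)[0] is False
```

That gap between the two assertions is exactly what a replacement mechanism (native batching, or something similar) would need to close.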
I think the concern is that under sponsored transactions it's still the same: there still isn't a mechanism for making a guarded sub-call so that you can guarantee payment. My take on that is that that is sort of a separate mechanism that needs to be built. The first mechanism, which gives you a separate gas payer and better meta-transaction behavior because it fixes the message-sender problem, is sponsored transactions. Then, potentially, the additional thing that the community needs to come up with is either native batch transactions or something like that as a precompile, so that they can do what would effectively be guarded sub-calls, but via a native batch transaction instead.
So my take is that we aren't going all the way to say, here are all of the new tools that you need for meta transactions; this is one of them, kind of like the baseline one that affects things, and
my hope is to get the community that's very interested in meta transactions more engaged and involved, to help build out whatever other EIPs need to be made to make sure that meta transactions can still work.
So I feel like if we're going to take on an EIP for Eth 1x, it should be the batching of transactions, because that's the thing that's actually being broken by ungas.
And I feel like the people who really care about meta transactions actually should be focusing on sponsored transactions, because that's something that's really not possible right now anyway.
I generally agree with you. You can do some of the gas-payer stuff; what you don't get is msg.sender. So everything meta-transaction is effectively a distraction from Eth 1x stateless research, but it still has to be looked at and addressed if we're going to break it. That's the reason why I am at least spending a decent amount of time focused on that issue: so that we can hopefully get more people involved in getting it fixed, independent of this group, rather than just saying "we are breaking it, here's an idea for how to fix it, but we're not gonna do anything to actually make that happen".
I feel like that's what we're doing with the sponsored transactions, though, because we're not breaking anything related to msg.sender (that's already not possible), and what we are breaking is guarded sub-calls, but we're not addressing that with an EIP, you know?
that is a very valid point.
It's possible that I have just been focused on the first shiny thing that showed up in front of me, which was sponsored transactions, but refocusing on batching might be the right choice here, given that you're right: you can use `ecrecover` to deal with the gas payer and have the same status quo msg.sender today.
Um, whereas you can't do guarded sub-calls using existing mechanisms if we break gas introspection, or gas visibility. A good point. I'm gonna think on that for a bit, and it may change some of the focus and direction. I still want to play nice and try to shepherd getting those things through, because I don't want a giant political mess of "we broke meta transactions and didn't do anything to help fix it". So some of this is, I don't want to say it's pure politics, but it's very much: if I'm gonna break other people's toys, I should at least make some effort to make some newer, nicer toys or something. There's a metaphor hiding in there.
Alright, um, I'm gonna move us on, just to stay on track. I'm happy to come back to this, and thank you, Matt, that is a very good point: the sponsored transactions are maybe not the most important thing to fix for addressing meta transactions. Alright, Sina, Alexey, or Paul, any of the three of you want to jump in and update?
I can go next. I'll give a quick update on the recent developments since the last call; they fall under three points.
The first one is I compared the approach that I described before, which was the jumpdest chunking mechanism, with the fixed-size one, and there doesn't seem to be a big difference, especially when they have similar chunk sizes. So we could go ahead with fixed-size chunking if people deem it simpler, between these two at least.
The second one was I wanted to see how much efficiency we can get from the chunking algorithm itself: how much overhead there is to the chunking mechanism in the jumpdest approach. I did that by measuring, for the chunks that we sent, how much of them was actually utilized to run the transaction.
And it seems we can expect somewhere between a 10 and 15% improvement with a hypothetically optimal approach. Although after I wrote this on Discord, I realized that I had some more assumptions in there, so ten to fifteen percent is not really accurate, but I think it's a useful number, especially for approaches that have similar chunk sizes. And the third development is that there are, if I'm not mistaken, two more people working actively on this: Peter and Sandra from PegaSys, and one of them is working on
implementing the fixed-size chunker, just to make sure that we have the right numbers, and I really appreciate that because it's always good to confirm the numbers. They are also experimenting with a new approach based on chunking by Solidity functions. But they are not here right now; they could maybe give an update themselves, but I don't see them on the attendee list, unfortunately.
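For readers catching up, a fixed-size chunker of the kind being benchmarked here can be sketched as follows. The 32-byte chunk size, the one-byte first-instruction-offset metadata, and the sha256 binary tree are illustrative simplifications of ours, not the actual proposal under discussion:

```python
# Illustrative fixed-size code chunker for code merkleization. Each
# chunk carries the offset of its first real instruction, because a
# PUSH's immediate data can spill across a chunk boundary and jump-
# destination analysis needs to know where real opcodes resume.
import hashlib

CHUNK_SIZE = 32
PUSH1, PUSH32 = 0x60, 0x7F

def chunkify(code: bytes):
    """Split code into (first_instruction_offset, chunk_bytes) pairs."""
    chunks = []
    i = 0  # position of the next real instruction in `code`
    for start in range(0, len(code), CHUNK_SIZE):
        chunk = code[start:start + CHUNK_SIZE]
        end = start + len(chunk)
        fio = min(i - start, len(chunk))  # push data may spill past `start`
        while i < end:                    # scan instructions in this chunk
            op = code[i]
            i += 1
            if PUSH1 <= op <= PUSH32:
                i += op - PUSH1 + 1       # skip the PUSH's immediate bytes
        chunks.append((fio, chunk))
    return chunks

def code_root(code: bytes) -> bytes:
    """Binary Merkle root over metadata-prefixed chunks (sha256, illustrative)."""
    leaves = [hashlib.sha256(bytes([fio]) + c).digest()
              for fio, c in chunkify(code)]
    if not leaves:
        return hashlib.sha256(b"").digest()
    while len(leaves) > 1:
        if len(leaves) % 2:
            leaves.append(leaves[-1])     # duplicate the odd leaf out
        leaves = [hashlib.sha256(leaves[j] + leaves[j + 1]).digest()
                  for j in range(0, len(leaves), 2)]
    return leaves[0]

# A PUSH32 whose data crosses the chunk boundary, then a JUMPDEST:
code = bytes([0x7F]) + bytes(32) + bytes([0x5B])
assert [fio for fio, _ in chunkify(code)] == [0, 1]
```

A witness then only needs the chunks a transaction actually executed, plus the Merkle branches to the code root, rather than the whole bytecode.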
I think during their experiments they are also doing some other analysis, like: how many Solidity contracts are there on mainnet, are there static or dynamic jumps, how frequent are those on mainnet, and so on, which could be useful for other purposes as well. And yeah, finally, Piper, I know you hinted a few times at writing a spec.
I've been talking to Paul, who's helping with the witness spec, and I'm seeing if we can start work on that, at least maybe for one of the approaches. But we still haven't done anything on that; this is just in the consideration phase.
Do you think it would be possible to get a draft EIP up that links to the existing witness spec and says, basically, there's a blob of technical whatever to fill in, an EIP that can be kind of a to-do, but that still wraps up the high-level approach and gives enough context that we can get some of the broader conversation going, so that we can maybe be on track to get this into whatever fork comes after Berlin?
Yeah, I think that's, mmm, that's a good plan. That would be a good one, yeah.
Because that way there's still a bunch of, you know, human-readable language to be put in there, to at least start to capture the motivation: why are we doing this, and all of these things.
Because Berlin is coming. I should actually know roughly when, but I don't. But essentially I think it's time for us to start looking at it: if we miss the next one, then we're that much further behind on the things that we can get done now. So I'd love to start getting... and the reason I'm not jumping out and writing these EIPs myself is because I know that if I do, I'll be anchored to them and I'll end up having to spend a lot of time.
Are you currently talking about what kind of EIPs? About the witness format, or what?
Uh, so code merkleization is something that we could in theory do in the hard fork after Berlin. I'm confident that it's a good idea for us to do, at least if...
But it's a thing that doesn't really give us the biggest bang for the buck, basically.
My thought is that it's not blocked by anything else, and if we are confident enough that it is the right thing to do, then...
It's kind of dependent on the witness spec and the binary transition.
How is it dependent on the binary transition? That isn't clear to me.
Okay, so it's not dependent, but the proof sizes would be smaller if we had the binary transition, because in the end we're sending Merkle proofs.
Well, basically, code merkleization would be pretty much useless if you deploy it ahead of everything else.
And I don't think we should be deploying things which are currently useless just in the hope that they're going to be useful later on, because by the time you get to that point, you might realize you actually should have done it a different way.
Okay, that's fair, but it doesn't mean we can't have it ready and staged to go.
So yeah, yes, but the sequence is important, the sequence in which you are going to do the things.
Okay, that's fair. So the implicit, maybe even not implicit, assertion there is that we should roll out binary trees first, or binary trees and code merkleization at the same time at minimum, and not code merkleization ahead of it.
So, since it's not definitively the most useful thing... That said, I still have envisioned us passing witnesses around the network well before they're mandatory, and so anything that actually reduces witness sizes lets us reduce that kind of on demand.
Well, passing witnesses around the network doesn't require any hard forks.
You can do it whenever they're ready. So therefore I don't really see a big point in tying this to the Berlin hard fork or anything like that.
Now, I think the idea is that if we add code merkleization, it becomes possible for witnesses to contain only part of a contract's code, and so any witnesses that are already being passed around get decreased in size.
That is my argument. I'm not saying it's a compelling one, to say we should absolutely do this right away.
I'd argue in some sense it's a compelling one, because code sizes are the critical path and the bottleneck in terms of how big a worst-case witness would be. Right now a worst-case witness is something like 400 megabytes, and if we add code merkleization it could possibly go down to the tens of megabytes.
So very significant gain for like beam syncing clients, for example.
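The proof-size intuition behind doing this alongside the binary transition can be checked with a back-of-envelope calculation; all figures here are illustrative assumptions, not mainnet measurements:

```python
# Back-of-envelope comparison of merkle proof sizes for a hexary versus a
# binary tree. Numbers are illustrative only.
HASH_LEN = 32  # bytes per hash

def tree_depth(n_leaves: int, arity: int) -> int:
    """Levels needed for a perfect tree of the given arity to hold n_leaves."""
    depth, capacity = 0, 1
    while capacity < n_leaves:
        capacity *= arity
        depth += 1
    return depth

def proof_size(n_leaves: int, arity: int) -> int:
    """Bytes of sibling hashes needed to prove a single leaf."""
    return tree_depth(n_leaves, arity) * (arity - 1) * HASH_LEN

leaves = 2**27  # roughly the order of mainnet state items, assumed
hex_proof = proof_size(leaves, 16)  # 7 levels x 15 siblings x 32 = 3360 bytes
bin_proof = proof_size(leaves, 2)   # 27 levels x 1 sibling x 32 =  864 bytes
```

So even though the binary tree is about four times deeper, a single-leaf proof shrinks by roughly 4x, which is why proofs over merkleized code benefit from landing after (or with) the binary transition.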
Yes, and that is very much in line with my thinking: while it may not be in-protocol gains that we get, we still get some gains. We don't have to make a decision today. I'm not asserting that we should do these in that order, just that we can.
And so we can come back to this. I think it's worth discussing out of band, and next call we can maybe start trying to have a map of, I don't know, here is the order in which we could do all of the different EIPs, and is there a benefit in this order versus that order, that sort of thing.
Actually, one quick question on the code merkleization: what's the current preferred strategy for how existing contracts get transitioned over?
I can imagine something, or Sina, if you have thought this through, I'm happy to hear from you.
No, I haven't given it much thought.
I have two possible strategies if you want. So, strategy number one: I am still convinced that we should do the fixed-size chunking, though we need to figure out what the best chunk size is, and binary trees, of course. And there are two ways we could do it from my point of view: one basically doesn't require much extra research, and the other one does. The first one is where you simply take all the jump destinations that you got from the analysis, prepend that to the contract code in some way, and merkleize that piece of information, which means you make the first chunk basically mandatory, however many chunks contain the jump destination table. In this way you achieve the goal that you always have a jumpdest table whatever chunks you retrieve, and therefore you can always tell which jump is valid and which is not, even though you're missing particular bits.
And that's basically strategy number one.
Strategy number two is a bit more involved, but it makes things a bit more decoupled, and it relies on something that I mentioned before, which is basically proving that most contracts do not in fact contain invalid jumps. I think that could actually be proven. So basically the client implementation would go through the contracts, complete this analysis, and mark them. If a contract is marked as one with no invalid jumps, then you can trivially just chunk it up without any other consideration; however, if something does contain invalid jumps, it is exempt from code merkleization. So that's the second strategy. Maybe there are some more, but those are mine.
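As a rough illustration of strategy number one, here is a minimal sketch of fixed-size chunking with a prepended jump destination table. The chunk size, the 4-byte offset encoding, and the use of sha256 instead of keccak are all placeholder assumptions, not anything from a spec:

```python
import hashlib

CHUNK_SIZE = 32   # assumed chunk size; the best size is still an open question
JUMPDEST = 0x5B   # the JUMPDEST opcode

def jumpdest_table(code: bytes) -> bytes:
    """Collect offsets of valid JUMPDESTs, skipping PUSH immediate data."""
    dests, i = [], 0
    while i < len(code):
        op = code[i]
        if op == JUMPDEST:
            dests.append(i)
        if 0x60 <= op <= 0x7F:        # PUSH1..PUSH32 carry 1..32 data bytes
            i += op - 0x5F
        i += 1
    return b"".join(d.to_bytes(4, "big") for d in dests)

def chunk_and_hash(code: bytes):
    """Strategy 1 sketch: prepend the jumpdest table, split into fixed-size
    chunks, and hash each chunk (stand-in hash; a spec would use keccak)."""
    data = jumpdest_table(code) + code
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    return [hashlib.sha256(c).digest() for c in chunks]
```

Because the table sits in the leading chunks, any witness that includes those mandatory chunks lets the verifier tell valid jumps from PUSH data without the rest of the code.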
Yeah. Vitalik, is that the question you were asking? I interpreted it as...
I was asking a slightly different question. Obviously there's going to be a procedure for changing the hash that some particular contract has in the tree; my question is more like, would we run through the entire state and do it all at once, would it be a poking sort of thing, or would it be something else?
I've imagined it as a poking thing, and maybe going ahead and getting account versioning out of this, so that we can have a concept of upgrading accounts.
I mean, considering the current size of the code, which is not very large, I would say that the better strategy might be just doing it as a one-shot, because as far as I understand, the size of the entire code is currently something like 1.4 gigabytes on mainnet, maybe even less than that. So going through that is not actually a super hard achievement.
And also, another thing is that when we do the hexary to binary move, that seems like the sort of thing that's going to have to be done one-shot, so it might even be better to do all the one-shot stuff at once. Or alternatively, if we figure out the one-shot machinery for the simpler code rehashing task, we could just reuse it for the binary upgrade, yeah.
I'm pretty much in favor of developing the machinery for these things, because it might be immensely useful for lots of interesting things: say you want to change the hash function next, and then after that you want to remove the merkle trees altogether. This machinery would be very useful for that pathway.
I wonder if you can reuse this machinery for that.
I think the high-level thing that I just want to point out, and this is sort of directed at Sina: I want you to think about whether or not you're game for being the EIP-er of this, and if not, speak up and we can see if we can find somebody who is. But whoever does write the EIP, this is clearly something that needs to be part of it: what is the migration strategy and how does that work? It's beyond just what the scheme is for how we merkleize; how we actually get to merkleization is something that needs to be part of that EIP.
There's also the pricing question.
As in repricing, essentially, the cost of deploying code, right? Because it now has merkleization costs?
And calling, and the witnesses, so it has a lot in common with the general witness pricing.
So, and I feel like I kind of interrupted you, sorry. We've got a pricing thing that comes down the line, which is how do we price the individual usage of chunks of code? But then there's also the actual cost of merkleization at deploy time.
And then maybe the cost of actually accessing stuff, but that seems like it starts getting complicated. I guess it's the cost of loading more pieces of the code into memory, I mean.
You can conceivably also do something else. Remember there is this limitation of 24 kilobytes on the code size, which was introduced as a response to the Shanghai attacks.
Because I think there was no limitation before. And one of the reasons for it is that we have this hashing which goes over the entire contract, which essentially means that whenever you touch anything in the code you have to load the entire thing.
But if you start using code merkleization, you don't actually have to load the entire thing. You could actually allow unlimited-size contracts, in a way, if you work out the pricing, which would basically depend on how many pieces you're loading. It's a bit more complicated pricing, but a lot of people do want this limitation to be lifted, because at the moment, if you do want to construct something large, you have to use proxies, you know, stitching together multiple contracts.
That's a good point. Although for stateless clients it wouldn't make a big difference, because they would receive only the chunks that are necessary to execute. So I don't know how that would affect pricing.
I think what Alexey was getting at is that with code merkleization we might need to up the cost of deploying code, but we might be able to either lift the hard cap or change some of the pricing for things like loading code from another contract, things like that.
So yeah, because the reason why there's a cap is that whenever you load the code you have to load the entire thing. But if you don't have to do that, you can just load as you go, pay as you go when you load the code, something like this.
But then whenever you start talking about pay-as-you-go, we get into other tricky territory and things like that.
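The pay-as-you-go idea can be sketched in a few lines; the chunk size and the gas number below are invented for illustration and are not a proposed schedule:

```python
CHUNK_SIZE = 32       # assumed code chunk size
GAS_PER_CHUNK = 200   # invented per-chunk load cost, not a real schedule

def code_access_cost(touched_offsets, code_len: int) -> int:
    """Charge only for the distinct chunks an execution actually touches,
    instead of a flat cost that assumes the whole contract is loaded."""
    chunks = {off // CHUNK_SIZE for off in touched_offsets if 0 <= off < code_len}
    return len(chunks) * GAS_PER_CHUNK

# A call that only executes bytes 0..40 of a 100 KB contract touches two
# chunks, so under this toy schedule it pays the same whether the contract
# is 1 KB or 1 MB, which is what would let the 24 KB cap be revisited.
```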
All right, so I'm going to move us on to the next topic, unless anybody has anything specific that they really want to bring up, because we're theoretically behind on time right now by a small amount.
I just want to note it's not only the cost of loading but also the cost of analysis, which is one of the reasons for the bound. You have JUMPDEST analysis right now, but if the subroutines EIP is merged, then there's going to be more analysis needed.
Okay, so looking into how subroutines and code merkleization may interact would be at least a good area to pay attention to.
All right, um, Paul, would you like to give us an update on the witness spec? I feel like that ties in nicely with this.
Yes, briefly: the spec, for those who aren't following the repo and just follow the blogs or maybe the calls, is in a period of stability for the encoding.
So there are already some implementations, and I don't want to frequently change it, so changes are going to come in a batch. There are some proposals for improvement by Axic and Piper in the GitHub issues for some optimizations; those are coming. And there are also aesthetic changes coming to the doc.
The focus right now is tests.
Both basic tests, you know, a single account, different kinds of accounts, different edge cases, and also real block tests. State root tests are very simple; the name of the test sort of says what kind of test it is, maybe a basic test or a real-world block test.
And then merkleization is important as well. But I'm trying to make it agnostic, so that if the format changes, whether it's a big overhaul or just small optimizations, we wouldn't have to rewrite all the tests. So right now the architecture is a filler: I have a higher-level sort of language for specifying the tests, and then I fill the tests with whatever the current spec is.
And so the tests are sort of encoding-agnostic, as long as they get filled by something that's updated. One thing I want to talk about: there can still be big changes, and there could be competing witness specs, whether merkle tree or other crypto, and I don't want to overpower those. I just want to have something stable that works, that has reasonable properties we can measure, and then for competing proposals, hopefully there will be something good, but if not it's okay; maybe we can compare them, measure them against each other.
So I think the tests are format-agnostic. An important thing to work on, as people start implementing, is that it's nice to implement against a test suite. And that's it for the update. So not much discussion needed, and a lot of progress on tests coming soon. I already published some links, but the merkleization part is the tough one: filling the tests with the correct merkle root.
But I remember you computing them correctly; I just have to build out all the rest. So that's it. I'm open for discussion, or we can move on to the next step.
Yes, so thank you very much, Paul. This is great, and as you mentioned, there will be further proposals coming. Unfortunately I didn't manage to get any time into this, but as I mentioned, I was thinking about...
Basically gradually changing the witness format in the direction of simplification, I hope, which is basically key-value. Instead of this sort of language with the opcodes, I would like to replace it with essentially a sequence of key-value pairs, where the values could be either account leaves, storage leaves, or intermediate hashes, and they are presented as a sorted sequence of keys, where you can apply some optimizations where adjacent keys are similar to each other. But essentially, in this witness format you don't say how you merkleize it; it's sort of implied, whether you are using a hexary Patricia merkle trie, or a binary merkle trie, or whatever. You're just given this list of things, and it contains intermediate hashes, but computing the structure is implicit rather than explicit as in the current format, which I think makes it much more elegant and simple. It would also be easier to convert to binary, because basically you won't need to make any changes at all in the format when you want to convert to the binary form.
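A minimal sketch of that key/value shape might look like the following; the kind tags, header bytes, and field names are all hypothetical, since this format has not been written down yet:

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical tags for what each entry's value represents.
ACCOUNT_LEAF, STORAGE_LEAF, INTERMEDIATE_HASH = 0, 1, 2

@dataclass
class Entry:
    key: bytes   # full path for leaves, truncated path for intermediate hashes
    kind: int
    value: bytes

@dataclass
class Witness:
    version: int    # format version byte
    tree_kind: int  # e.g. 0 = hexary MPT, 1 = binary trie (assumed header byte)
    entries: List[Entry] = field(default_factory=list)

def is_well_formed(w: Witness) -> bool:
    """Entries must be strictly sorted by key. How to merkleize them is
    implied by tree_kind in the header, not spelled out per entry."""
    keys = [e.key for e in w.entries]
    return all(a < b for a, b in zip(keys, keys[1:]))
```

The point of the sketch is that the same entry list works under either trie shape; only the header byte and the recomputed intermediate hashes change.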
But yes, I unfortunately haven't found time to work on this yet. I actually have one question about this, though. I was just thinking: if you have a binary tree versus a hexary tree, these intermediate hashes will be different, right?
Of course, yeah, the values will be different, but the structure will be the same, basically.
-- Yeah, but you still need a version that tells you it was generated from a hex tree.
-- Oh yeah, so I think you can have some sort of version byte, and then another byte which tells you the way of merkleization. So yeah, we can encode it in a header, the obvious thing, and then you'll figure out, by looking at the header, which algorithm to apply to this stuff, yeah.
Alexey, I'm actually looking forward to it. I think you still need to specify some structure of the tree; you can't just provide leaves, because the structure depends on the rest of the state.
Yeah, I just have to sit down and write it down, because I'm pretty sure, like 99.99%, okay, there will be structure, right. But anyway, I actually have code inside turbo-geth which does this kind of thing; I just have to put it together.
Yeah. Um, so maybe by the time we have our next call there might be something that we can look at, the beginnings of something to say this might be a better, you know, an alternate witness format.
Yes, definitely yeah. I'm gonna make a note of that and force myself to do it
The reason that I kind of want to say, if we're going to do that, let's go for it, is that, like Paul said, there are some implementations. I know that Jason on my team started playing around with an implementation a while back, and I wouldn't be surprised if we're very soon around the corner from starting to get witnesses, very preliminary witnesses, flowing around. As soon as that happens, there starts to be a dependency chain built up, and ideally we start with the version that we end with, or at least some version of it. It would be great to be able to look at the two paths sooner rather than later.
To be honest, I'm not too worried about this, because once people start dipping their toes into this, it will be very quick for them to retool to the new format. I think the biggest part of the learning curve is getting into the context in the first place.
Anything else on witnesses? Otherwise I'll turn the floor over to Alexey to talk about turbo-geth APIs and the tooling that you guys have been generating.
Yeah, I have to head off, by the way.
So, if somebody did not see, I added to the agenda a link to this document that I wrote some time ago and keep updating. It's about the turbo-geth release, and there are a couple of things in it that are relevant to this discussion, because as I was mentioning in Paris and in our meetings, tooling is very important for this research to be more accessible and for more people to be involved. One of the problematic things about tooling is being able to do things like code merkleization research, replay transactions, generate witnesses, and stuff like this. And I hope the release of turbo-geth, which we will soon announce once we finish things up, will help a lot. There are two reasons. First of all, we decided to switch from the BoltDB database to LMDB. For those who don't know the difference: BoltDB is basically a native Go implementation, but LMDB is in C.
And the interesting thing about it is that it has a Python binding. Which means that you can take the database of turbo-geth, which is now going to be in LMDB, and open it up in Python. You can just do import lmdb, blah blah, open the database, iterate through it, get whatever data you want. And interestingly, we are testing it, but you can also do it while turbo-geth is running: you open it in read-only mode and you've got a hot database while it's being accessed.
I don't know how stable this mode is, but at least you can shut the node down and then just operate on the database as you want. And I think that will widen the circle of people who would want to use this quite a lot. We might as well integrate it with Trinity, because then you could use some of that code to execute the EVM things and just feed it the data from the database, things like this.
And the second reason why I think it will be very appealing is that, by introducing a completely new sync architecture, we've managed to massively reduce the sync time for what people call an archive node, but what I call a node with the full history.
And if you have a decent machine, well, I'm testing it on a NUC right now with an NVMe drive, you can pretty much sync in two days, about 50 hours, and you get the entire history, all indexed. So pretty much everything that you can do with Go Ethereum or OpenEthereum works, but at a much smaller cost in terms of disk space, and you get an accessible format, and you can...
Sync it up in two days, which also means that if you got something wrong, you can resync in two days. That basically helped me: I just kept resyncing things all the time when I was testing it. It was so great to be able to do that, rather than before, when it took three weeks. So these are the two things that I would point out, and I hope that we will be able to open these things up to more people, because we're getting very close.
From my perspective, and maybe this exists, but do you guys have some version of documentation or a write-up on your flat database implementation?
Well, not yet. Basically, my plan is to start documenting the data model, because as I mentioned in that link, in order to go through the database natively from Python you need to understand the data; you need to know how things are laid out. There is a beginning of this documentation, but it needs to be expanded.
The place that I was kind of stabbing at there is that the Trinity codebase at some point soonish needs to start its transition to a flat database layout, and having some clues and write-ups from those who have gone before us would be really nice, as a don't-make-the-same-mistakes sort of thing. Oh yeah, definitely.
I think that will be very useful, because these things took literally two years to figure out.
and I don't feel like figuring it all out again on our own so.
Um, cool. Anybody got questions about that, or is there anything else that you want to go into on those things, Alexey?
No, no. I've just had enough of a plug for my products.
Um, all right then, I'm going to transition us into our last item here, and this is me doing some thinking. One of our blockers for binary trees is state sync. Currently state sync has historically been based on `getNodeData`, and the assertion is that under binary trees the `getNodeData`-based approach gets worse.
To the point where it maybe isn't even viable anymore. Does that match other people's understanding of the topic, or am I off base there?
I mean, you can do some tricks to mitigate this, right? You can start moving things around, you know.
I guess the thing is, maybe it could be made to work with `getNodeData`, but given that we have this sort of growing dislike for `getNodeData`, the improvement will...
I would say that we need to remind ourselves what the real problems with it are, because general dislike is not a good reason, okay?
That's fair, yeah. The problems being that it essentially ties you to a certain type of database layout, which we're wanting to let people move away from, and that...
The biggest problem is the way that it addresses the content: it uses the hash of the subtree as the address, and nothing else. You have no clue where this data should be taken from, which basically forces the database to use the same content addressing. That's my biggest problem with it.
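The contrast being described here can be sketched in a few lines; these are toy in-memory models of the two request shapes, not the actual devp2p encodings:

```python
# With getNodeData, the only handle on the data is hash(node), so the server
# must maintain a content-addressed index over every trie node. A snap-style
# range request addresses data by account key instead, which a flat,
# key-sorted database can answer with a single range scan.

def serve_get_node_data(hash_index: dict, wanted_hashes: list) -> list:
    """getNodeData-style: look each node up by its hash."""
    return [hash_index[h] for h in wanted_hashes if h in hash_index]

def serve_account_range(flat_state: list, start: bytes, limit: int) -> list:
    """snap-style: flat_state is a key-sorted list of (key, value) pairs;
    answer with a contiguous slice of accounts starting at `start`."""
    return [(k, v) for k, v in flat_state if k >= start][:limit]
```

A client that only stores the flat layout can serve the second request cheaply but cannot serve the first at all without rebuilding the hash index, which is the incompatibility being discussed.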
I haven't looked at it in depth. I looked at the snap spec a while back, but I haven't looked at it in the last few weeks.
I looked at snap the day that Peter started publishing the stats, and I did have a quick exchange with him on Twitter. I figured out that it does have the same problem as `getNodeData`, at least in two parts, and I asked him whether he would consider adding some extra addresses, which obviously increase the traffic, but which would allow us to implement it. Because otherwise we can't implement snap either; it basically repeats the same mistake of using hashes as the only address for certain things.
But he told me that it's not set in stone and we can definitely, you know, change the spec and figure it out, so I'm hopeful that we can do that.
That leads right into the ideal scenario that I'm looking at, which is: we've already got a lot of work done by the geth team on snap. Snap has shown that it's a pretty solid sync protocol; it yields major, order-of-magnitude or more, improvements on current fast sync. And while I think there is still an iteration to be done here at some point, the kind of merry-go-round-shaped version that gets us maybe another major step of gains by taking advantage of things at the network level, we could switch over, essentially piggyback on top of snap, as the way to do the binary transition. For some context there, the idea is that some clients, turbo-geth, probably geth, will be able, because they've done the work to flatten their database, to maintain maybe two versions of the same database, one binary, one hex, for some period of time. And the idea is that for the transition from hex to binary we literally just flip the switch at a block, and all of the clients that don't have the ability to have both at the same time just rely on essentially re-syncing once that block happens.
This new sync protocol that's fast and efficient is the thing that supports the network during that changeover. So there's a lot of stuff in there to unpack, and we can debate any of it right here, but the general idea is that we need a sync protocol that's going to work both past the transition to the binary tree and, in theory, as a mechanism for clients to sync the binary tree if they're not able to build and maintain both concurrently.
Yeah, that's a great summary. Yes, that's exactly my understanding of this.
I'd be curious if you're able to do, even if it's just in one of our channels or in the sync channel, a breakdown of where you see... what are the things about snap that need to be changed if we want to leave behind this old addressing model for the database.
Okay, yes. So far I had only identified the things that would make it impossible to implement in turbo-geth, but I haven't actually thought about whether it would be compatible or practical with binary trees. I will look at it again from these two points of view now.
Because if we can, that'd be a big win: we get to scrap, or delay, or table, whatever, this kind of idea of MGR, while I know... I don't...
I wouldn't scrap it, because what I think is...
Piper: I'm not saying scrap it. I'm saying that we can still look at it, we can still work on it and develop it, but it stops being absolute critical path and becomes, like, this is the ideal option, but we have this other pragmatic option. Correct.
Yeah, no, this is great, because for me, essentially, what the snap numbers demonstrate is basically how bad the current algorithm is, right? That's basically what it demonstrates.
But it does not demonstrate where we could be. I think there is a much better way of doing it. After looking at Peter's numbers, like, wow, but if we can do even better than that, what are we talking about, a couple of hours and basically you're all synced?
Exactly, which is where I was kind of half asserting that there's still another order of magnitude to be gotten out of sync if we leave `getNodeData` behind, yeah.
So what I'm seeing, and what I'm kind of trying to do, is gather up: this is the most pragmatic way that we could do things, this is sort of the ideal, and build out this spectrum of approaches for how we're going to tackle each of these things, so that we have fallback options.
When something gets complicated, when something gets delayed, when an unknown unknown shows up, or whatever throws a wrench in our original plans, we can, if nothing else, worst case, fall back on one of these slightly dirtier, slightly less ideal options that still gets us to this end goal that we're trying for.
we've got to be anti-fragile yes.
And a chaos monkey or something.
No, no, it's not... you have to intentionally build your strategy to be anti-fragile, because it's not an emergent property; it has to be consciously built up like that. All right, um, so...
Um, somebody mentioned earlier that they were indeed specifically working on the binary tree transition spec, et cetera. I'd love it if by our next call we could have something a little bit more formal in place there, if that is indeed the case. I don't remember who it was... the guy's name was Guillaume.
I don't know if he's on the call.
Okay, well then maybe I can try to poke him offline or over Discord and see if it's possible for us to have anything tangible to look at by our next call. Code merkleization: thank you all for the work on that, and I really think we're at the point where we could at least start getting some preliminary things written up so we've got a starting point.
Same with the EIPs for meta transactions and things. Matt, thanks for the feedback on whether or not we're working on the most important part of that; I'll be looking at that. And yeah, Alexey, I think we're looking at an alternate witness spec by next call, so that we can start looking at the two things side by side, and that is becoming a loose dependency of things like the EIP for code merkleization and, at some point, general witness passing.
Things are starting to come together. It's roughly the roadmap that we worked out a while back, but everything is slowly taking form. Anybody want to throw anything in before I put a lid on this and call it a day?
Cool. Thank you all for your work on this. It's really nice to see this coming together. And whatever the hard fork is that comes up after Berlin, you know, I don't want to throw something in just for the sake of throwing something in, but I think it'll be a big win if we can actually get an EIP included and deliver a piece of this roadmap, to get some momentum, you know, transitioning from this research phase to an implementation phase. So, all right, thank you everybody, have a good rest of whatever day it is for you.
Whatever time of day, et cetera see y'all later on. Yeah, yep.
Thanks Yeah. Bye everybody.