<h1 style="display: none;">Hivegrid Spec</h1>

<img src="https://storage.googleapis.com/ethereum-hackmd/upload_0148b70529d1db33bc2eb3113e277668.png" width="600" alt="Hivegrid">

## What is Hive?

Before diving into hivegrid we will go over some preliminary hive information!

Hive is an end-to-end test harness used for integration tests against ethereum clients. It generally works by spinning up 3 types of docker containers working alongside the hive controller:

1) **Simulators** - within hive, tests are grouped by test suites which are grouped by simulators. For example, the tests that are specific to the engine api for cancun are grouped within the `engine-cancun` test suite, which is held within the `engine` simulator. Tests are run within their corresponding simulator container and interact with clients using RPC or p2p.
    - **Containers must be rebuilt upon changes to the simulator or tests within the simulator.**
    - **Simulator runtime is usually longer than the build time if there are more than 50 tests.**
2) **Clients** - client containers simply contain a single client built with a specific configuration. As before, clients can communicate with simulator containers using RPC.
    - **Containers are rebuilt when changes are made to the client branch you are pointing to.**
    - **Build times can be long depending on the client.**
    - **Multiple client containers can exist in one test.**
3) **Hiveproxy** - within this container, a proxy server is run to facilitate communication between simulators and clients (inter-container communication).
    - For example, sending an RPC request from a simulator to a client, or receiving a response from a client in a simulator, is achieved by relaying the information via hiveproxy.
    - **Practically never needs to be rebuilt.**
4) **Hive Controller:** this component operates externally to the docker environment and acts as the central orchestrator for simulations.
Its responsibilities include initializing simulations and managing the lifecycle of both simulator and client containers.
    - Upon initiating a simulation, the hive controller starts an API server for the simulator.
    - The `HIVE_SIMULATOR` env var directs simulators to the URL for the API server.
    - The server allows simulators to launch client containers and report results back to the hive controller.

**From the existing [documentation](https://github.com/ethereum/hive/blob/master/docs/overview.md):**

![](https://storage.googleapis.com/ethereum-hackmd/upload_f74ee1f1fe9db51d409a31b7b1dbcd91.png)

## Current Hive Flow

Running a simulator with hive involves several coordinated steps between the hive controller, simulators and clients. Here is a high level breakdown of what happens when you run a simple command:

```
./hive --sim ethereum/engine --client go-ethereum --sim.limit <test_suite_regex>/<test_name_regex>
```

1) **Hive Controller Initialization:**
    - Executes directly on the host system, initializes and begins the simulation process.
    - Reads and stores the cli params to determine which simulators and clients are involved in the simulation.
2) **Docker Image Preparation:**
    - The hive controller instructs the docker daemon of the host system to build or pull the required images for the specified simulators and clients using an internal docker API (`internal/libdocker`), in the following order:
        - Hiveproxy image: built first, as it enables communication within the docker network,
        - Client images: all the client/s specified are then built,
        - Simulator images: followed by the simulator/s specified.
    - **Note: if nothing changes between simulator runs, the cached docker images held within the docker daemon of the host are used and not rebuilt.**

    ![](https://storage.googleapis.com/ethereum-hackmd/upload_9208f317a25f68f98f3f31d2b040d3e9.png)
3) **Simulator Commencement:**
    - API Server Startup: the hive controller initiates the API server for the simulator, which handles communications and commands between the simulator and itself.
    - Hiveproxy Container Launch: the hive controller then launches the hiveproxy container for relaying HTTP requests within the docker network.
    - Simulator Container Launch: and then launches the container for the simulator.
        - Note that relevant simulator cli variables stored within the hive controller are added as env vars to the simulator container.
        - For example, the value of `--sim.limit` is stored within the `HIVE_TEST_PATTERN` env var local to the simulator container.

    ![](https://storage.googleapis.com/ethereum-hackmd/upload_81100d4b0093f5f28f35a9f264fa19a7.png)
4) **Test Suite & Test Case Filtering:**
    - For every test suite defined within a simulator:
        - The `HIVE_TEST_PATTERN` env var local to the simulator container (which stores the cli value from `--sim.limit`) is used to filter the test suites to run within the simulator.
        - In a similar fashion, for every test case defined within a test suite we also filter using the value defined by `--sim.limit`.
    - For example, for the `engine` simulator, if we used `--sim.limit "cancun/Blob"` we would filter to only use the test cases that have "Blob" in their name/definition within the `engine-cancun` test suite. See image below:

    ![](https://storage.googleapis.com/ethereum-hackmd/upload_c0123fe2ce04f5bd45fcd43d464bf8a4.png)
    - Note that `--sim.limit` filters using regex. If not used, hive will run all tests for every test suite within a simulator.
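
The filtering in step 4 can be illustrated with a small Python sketch. This is not hive's actual implementation (hive itself is written in Go). It only assumes that the pattern has the form `<test_suite_regex>/<test_name_regex>` shown in the command above and that each part is applied as a regex search. The function names and the suite/test names are hypothetical, and details such as case sensitivity may differ in hive.

```python
import re


def parse_test_pattern(pattern):
    """Split "<suite_regex>/<test_regex>" into two compiled regexes.

    A missing or empty part compiles to an empty regex, which matches any name.
    """
    suite_part, _, test_part = pattern.partition("/")
    return re.compile(suite_part), re.compile(test_part)


def filter_tests(pattern, suites):
    """Return only the suites and test cases whose names match the pattern."""
    suite_re, test_re = parse_test_pattern(pattern)
    selected = {}
    for suite_name, test_names in suites.items():
        if not suite_re.search(suite_name):
            continue  # the whole test suite is filtered out
        matching = [name for name in test_names if test_re.search(name)]
        if matching:
            selected[suite_name] = matching
    return selected


# Hypothetical suite and test names, for illustration only.
EXAMPLE_SUITES = {
    "engine-cancun": [
        "Blob Transactions On Block 1",
        "ForkchoiceUpdated Version on Fork",
    ],
    "engine-withdrawals": ["Withdrawals Fork on Block 1"],
}

print(filter_tests("cancun/Blob", EXAMPLE_SUITES))
# {'engine-cancun': ['Blob Transactions On Block 1']}
```

Inside a simulator container the pattern would come from the `HIVE_TEST_PATTERN` env var rather than a literal string, e.g. `os.environ.get("HIVE_TEST_PATTERN", "")`. An empty pattern keeps every test in this sketch, mirroring the note that hive runs all tests when `--sim.limit` is not used.
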
5) **Test Case Execution:**
    - For each filtered test case run within a test suite within a simulator :P, separate fresh client container/s are launched.
        - These are requested by the simulator from the hive controller using the simulation API, depending on how the test is written.
        - When the simulator requests a client instance, the hive controller launches a new docker container using the built client image.
        - Client containers are configured using env vars typically defined by the simulator, e.g. `HIVE_SHANGHAI_TIMESTAMP`.
    - After the client/s have finished starting, tests within a simulator communicate with it/them over RPC or p2p (using the hiveproxy).
    - At the end of each test case the client containers are deleted, see below.

    ![](https://storage.googleapis.com/ethereum-hackmd/upload_d1c7f4d98fa3416dfd063df5dc116d88.png)
6) **Logging During Simulator Runs:**
    - Both simulator and client containers generate logs as part of their operation. These logs can include details about the test execution, interactions (like RPC calls) and errors.
    - Currently the hive controller mostly asynchronously dumps the logs for each test case run within `./workspace/logs/`:
        - **client specific logs** for each test case run are stored within their own dir: `./workspace/logs/<client_image_name>/`
            - with the naming convention: `<unix_timestamp>-<client_container_id>.log`
        - **simulator specific logs** for each simulator run (filtered by test suite and test case), so essentially it's one big log file:
            - with the naming convention: `<unix_timestamp>-simulator-<simulator_container_id>.log`
        - **integrated test suite/clients logs**, for each simulator run, multiple big files stored within `./workspace/logs/details/`:
            - with the naming convention: `<unix_timestamp>-<simulator_container_id>-<test_suite_index>.log`
            - for example, if you are running 3 test suites within a simulator you will end up with 3 of these logs.
        - **simulator meta logs**, for each simulator run this is created with relevant structured information per test suite and test case. An example snippet for a simulator run with 1 test suite:
          ```
          {
            "id": 0,
            "name": "engine-cancun", # test suite
            "description": "\tTest Engine API on Cancun.",
            "clientVersions": {
              "ethereumjs_cancun-git": "9.8.1"
            },
            "testCases": { # test cases within the test suite
              "1": {
                "name": "engine-cancun test loader",
                "description": "",
                "start": "2023-11-06T19:40:42.088327665Z",
                "end": "2023-11-06T19:44:09.666043841Z",
                "summaryResult": {
                  "pass": true,
                  "log": {
                    "begin": 34165000,
                    "end": 34165016
                  }
                },
                "clientInfo": null
              },
            },
            "simLog": "1699299641-simulator-9bebca1df2edf2e768b8d89e773a7b469f451c376e18dda88b2d926995624aa1.log",
            "testDetailsLog": "details/1699299642-9bebca1df2edf2e768b8d89e773a7b469f451c376e18dda88b2d926995624aa1-0.log"
          }
          ```
            - `"id": 0,` corresponds to the `<test_suite_index>` for integrated test suite/clients logs.
            - `"name": "engine-cancun",` the test suite.
            - `"testDetailsLog": "details/1699299642-9beb...4aa1-0.log"` the integrated test suite/clients log.
            - Note the values within `summaryResult/log`:
                - currently hiveview uses these to find the location relevant to a specific test case within the integrated log. This needs to be improved.
7) **Post Simulator Run Steps:**
    - After a simulator has finished executing all test suites with their respective test cases, the simulator container is deleted followed by the hiveproxy container.

    ![](https://storage.googleapis.com/ethereum-hackmd/upload_00907d9eef32b9f89831d550f7facbd8.png)
    - Note the client containers are deleted after each test case run, so a fresh container is used for each test.
    - Container deletion is performed in the reverse of the order in which containers are built and launched.

## What is Hivegrid?

Hivegrid is exactly what the name implies: a grid/cluster of hive instances/nodes.

1) Its core or primary use case is essentially an orchestrator CI/CD system.
This would act as an overhaul of the existing system, one that is more maintainable and gives client teams more control & value.
2) The secondary use of hivegrid is simply to improve the run time and UX of running a hive simulator, where anyone can run their hive tests within the grid instead of locally.
    - Let's say we want to run all the test suites (and test cases) within the `engine` simulator using the grid! Instead of running `./hive ... --sim ethereum/engine` we would run `./hivegrid ... --sim ethereum/engine`.
    - Let's assume we have 10 hive instances within the grid. Currently we have over 200 `engine` tests. Hivegrid would split these tests to run on multiple nodes depending on demand, let's say 4 nodes are used.
    - In this case we run around 50 tests on each hive node, where each node uses parallelism. Theoretically this has the capability to massively improve the run time when running hive tests. Currently it takes over an hour to run only the `engine-cancun` tests locally with no parallelism.

<figure>
<center>
<img src="https://storage.googleapis.com/ethereum-hackmd/upload_b1a63ae78062e0bd58d0ca96c50d8e88.png" width="900" alt="Hivegrid">
</center>
</figure>

## Existing Hive CI

For the current hive CI we run multiple isolated [hive servers](https://github.com/ethereum/cluster/blob/master/hive/ansible/inventories/production/inventory.yaml):
- **EL Cancun tests:** https://hivecancun.ethdevops.io/
- **Interop EL/CL Cancun/Deneb tests:** https://hiveinterop.ethdevops.io/
- **Other EL tests**: https://hivetests.ethdevops.io/
    - Should be mainnet tests? Clients need to be updated/fixed, not currently maintained.
- **EL/CL mix**: https://hivetests2.ethdevops.io/
    - Combination of EL/CL tests? Clients also need to be updated/fixed, not currently maintained.

Within each server we enable a simple [service](https://github.com/ethereum/cluster/blob/master/hive/ansible/roles/hive/templates/hive.service.j2) that runs a [bash script](https://github.com/ethereum/cluster/blob/master/hive/ansible/roles/hive/templates/start-hive.sh.j2). This script runs specific hive simulators within an inefficient loop, repeating test runs over and over.
- *One potential hacky solution to the loop is here: https://github.com/ethereum/cluster/pull/881, but hivegrid is the optimal solution.*

Each simulator run within the loop dumps its results within a specific directory. [Hiveview](https://github.com/ethereum/hive/tree/master/cmd/hiveview) a simple front-end for viewing results within

###

Currently we have been running hive tests on the Hetzner server we were given a while back. We now have a playbook to add and remove users such that they all have separate docker hosts and don't interfere with each other during test runs.

I was thinking about opening the server up to other client teams to use for debugging if it's useful for them - the main benefit being faster test runs. However if the latter became popular it would be difficult to scale, and it's a bit too hacky for me.

The best scalable solution in my mind is to create a grid of hive instances. Let's call it hivegrid, for now with 10 instances/nodes/hives. These could be used to run hive as one would locally. So instead of running `./hive ..` you would run `./hivegrid ..`, specifying that you would like to run the tests on the grid.

Theoretically this could make for blazing fast test runs, especially if we create a fancy auto-updating caching system for client containers. Maybe some instances are specific to each client. Let's say we wanted to run all the engine tests (there are over 200): depending on hivegrid load we could allocate all 200 engine tests between 4 instances. So that's 50 tests per instance. Each instance could run with parallelism set to 8.
If the load is higher due to more users running hivegrid we allocate fewer instances etc.

Similarly, I've been thinking a lot about future CI/CD for hive, and I feel this could integrate quite well with it. Essentially we would have specific nodes allocated solely to CI. This would make it very easy for client teams to add to their CI/CD. Assuming we allow for different log outputs/formats it would be as simple as adding the appropriate `./hivegrid` command to a client CI.

There are still some more improvements needed for hive before we get to anything like this (especially documentation), however it's something I'd be keen to try working on early next year if it's feasible. 😄
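
As a rough illustration of the allocation idea above (200 engine tests spread over 4 instances, fewer instances under higher load), here is a minimal Python sketch. Hivegrid does not exist yet, so everything here is an assumption: the function names, the cap of 4 nodes per run and the even-split policy are hypothetical and only mirror the numbers used in this note.

```python
def nodes_for_run(free_nodes, max_nodes_per_run=4):
    """Hypothetical policy: take up to max_nodes_per_run of the currently free nodes.

    A result of 0 means the grid is fully loaded and the run has to wait.
    """
    return min(free_nodes, max_nodes_per_run)


def split_tests(tests, node_count):
    """Split tests into node_count contiguous chunks of near-equal size."""
    if node_count < 1:
        raise ValueError("at least one node is required")
    base, extra = divmod(len(tests), node_count)
    chunks = []
    start = 0
    for index in range(node_count):
        # The first `extra` nodes take one additional test each.
        size = base + (1 if index < extra else 0)
        chunks.append(tests[start:start + size])
        start += size
    return chunks


# Placeholder test names standing in for the ~200 engine tests.
engine_tests = [f"engine-test-{i}" for i in range(200)]

# Lightly loaded grid: 10 free nodes, so the run is capped at 4 nodes.
chunks = split_tests(engine_tests, nodes_for_run(free_nodes=10))
print([len(chunk) for chunk in chunks])  # [50, 50, 50, 50]

# Busier grid: only 3 nodes are free, so each node gets more tests.
chunks = split_tests(engine_tests, nodes_for_run(free_nodes=3))
print([len(chunk) for chunk in chunks])  # [67, 67, 66]
```

Each chunk would then be handed to one hive instance, which could itself run its share with parallelism (e.g. set to 8 as suggested above). A real scheduler would probably also want to weigh test duration and client image caching, which this sketch ignores.
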
