Proposal - Monitoring Tool to Help Identify and Mitigate Recirculating GIVbacks

I’m thrilled to see this proposal finally hit the forum! What form will the final product take for the Giveth team to use? Will non-developers be able to manage it? How customizable will it be? For example, the list of verified projects in our database is ever-changing. Will Giveth be able to modify this list ourselves, or will we need TrueBlocks to manage the tool?

4 Likes

Happy to see this proposal and to support it! Thanks @tjayrush :slight_smile:

In addition to Mitch’s questions, I’d like to know whether multichain support integration on Giveth would be charged additionally. I understand that multichain support in TrueBlocks is something you will be adding regardless, but I wanted to clarify whether any integration with future chains on “our side” would be treated as separate work.

Also, would we be able to give that final product to other projects to use in their environment? Maybe that’s not even possible so pardon my ignorance.

Thanks again!

2 Likes

My preference would be for your team and ours to work together to stand up a system that works independently of us. In other words, something that can be “installed and just works.”

The first deliverable would be a ‘data scraper’ that extracts all the available on-chain data (transactions, logs, neighbors, balances, etc.) about whatever set of addresses you provide (if you have an API listing the addresses, we can read directly from there).

Other than making sure it’s continually running (which is a DevOps issue), my hope is it would require little-to-no management.

If the list of addresses you’re interested in comes from an API, then this would be a non-issue. If the list is not an API, then a simple .csv text file would suffice.
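For illustration only, here is a minimal sketch of what reading such a .csv file could look like. The column name `address`, the function name, and the sample rows are assumptions made for this example (the two addresses are taken from the purple-list output later in this thread); this is not TrueBlocks code.

```python
import csv
import io
import re
from typing import List

# An Ethereum-style address: "0x" followed by 40 hex characters.
ADDRESS_RE = re.compile(r"^0x[0-9a-fA-F]{40}$")


def load_addresses(csv_text: str, column: str = "address") -> List[str]:
    """Return unique, lowercased addresses from CSV text, in file order."""
    seen = set()
    addresses = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = (row.get(column) or "").strip()
        if not ADDRESS_RE.match(value):
            continue  # skip blank or malformed entries
        value = value.lower()
        if value not in seen:
            seen.add(value)
            addresses.append(value)
    return addresses


# Hypothetical file contents; the "address" and "name" headers are assumed.
SAMPLE_CSV = """address,name
0x2bf034eccebc8cd60dab9c249b6c2996dcb7d8ec,Moloch Cloudship
0x41CB654D1F47913ACAB158A8199191D160DABE4A,Coletivo XV
not-an-address,Broken row
0x2bf034eccebc8cd60dab9c249b6c2996dcb7d8ec,Duplicate
"""

watch_list = load_addresses(SAMPLE_CSV)
```

Lowercasing and de-duplicating up front means the same address written with different capitalization is only monitored once.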

As far as customizing the output of the scrape, that is also possible. This would be a ‘configuration’ issue. Changing that configuration–currently–would involve a developer editing a config file.

Summary:

My goal is to make something that can be “installed and just works,” so you don’t have to do anything (other than making sure it stays running). It’s important to note that we’re proposing a solution that runs on your hardware. We do not provide this as a service. For this reason, and in answer to Marko’s question below, what we’re building is something that any other project with similar needs (such as a DAO, other donation sites, etc.) can use to stand up a “monitoring” system without relying on us.

The first deliverable would ‘scrape’ the data and ‘put it somewhere.’ ‘Somewhere’ being wherever you want it to go: a database endpoint, flat .csv files on a server, under the carpet…literally anywhere. You can help us understand where you would want the ‘scraped’ data to go.

The second deliverable of the project (after consultation with y’all and some data designing) would be to figure out how best to analyze the data to meet your needs. There’s a link in the proposal where I started discussing this issue.

Thanks for your questions. If you have any more, please ask. Happy to clarify anything.

3 Likes

Hi Marko,

Multi-chain support is included. We have some work to do to finalize support for multiple chains, but we added rudimentary support recently. That work needs hardening/testing to make it production-ready. This hardening/testing is part of the proposal.

Concerning your second question about more broad usefulness for other projects–the answer is “yes.” Please see my answer above to Mitch’s question where I detail this a bit more.

1 Like

Thank you, it’s perfectly clear now. No further questions for now.

2 Likes

I have seen what this tool can do… it’s amazing! Thank you so much for your collaborative efforts thus far and for all of the work that your team has already done on this front.

Super happy to see this proposal up in the forum and I am looking forward to a time that recirculation review using human tracking and spreadsheets is a thing of the past.

3 Likes

Hi. So I’m wondering what the next steps are.

I found this: Governance Process.

This is now true:

Proposals must remain on the forum, open
for Advice Process, for a minimum of 5 days.

It also says:

proposals can move on into either the GIVgarden
or the rDAO DApps for voting

Is this something I do or someone else?

I’m ready to get started in earnest.

Thanks for your help.

Yep, it will be useful to state somewhere the amount of GIV tokens you’re requesting (the GIVgarden only disburses GIV). If we take the $30,000 USD (30,000 DAI) in your proposal at the current price of GIV, we arrive at 67419.138948269 GIV requested.
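As a rough sketch of the arithmetic behind that figure: the GIV price itself is not stated in this thread, so the example below backs it out from the two numbers quoted above (30,000 USD and 67419.138948269 GIV). The function names are illustrative only.

```python
from decimal import Decimal


def implied_price(usd_amount: Decimal, giv_amount: Decimal) -> Decimal:
    """USD price per GIV implied by a USD amount and its GIV equivalent."""
    return usd_amount / giv_amount


def usd_to_giv(usd_amount: Decimal, giv_price_usd: Decimal) -> Decimal:
    """Convert a USD amount into GIV at the given USD price per GIV."""
    return usd_amount / giv_price_usd


requested_usd = Decimal("30000")          # from the proposal
quoted_giv = Decimal("67419.138948269")   # figure quoted in this post

# The price is derived from the two quoted figures, not taken from a feed.
price = implied_price(requested_usd, quoted_giv)
giv_needed = usd_to_giv(requested_usd, price)
```

Because the GIV price moves, the same USD request would map to a different GIV amount on another day; only the division stays the same.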

You can proceed to the next step by creating a proposal on the GIVgarden

Feel free to DM on discord if you need further assistance!

3 Likes

Okay. Proposal submitted to the Giv Garden. Gardens.

Really smooth experience. Kudos to that.

2 Likes

You deleted it cause it was set to use wxDai… when will you repost?

I’m trying to repost now. The 5,000 deposit from the previous post seems to have come back into my account (I can see it at the top of the GivGarden), but when I go to create a new proposal, it says I don’t have enough Giv tokens.

At the top of the page:

image

When I go into create a new proposal:

Thanks for everyone’s patience. I had to remove the previous proposal and make another one. Here’s the new proposal: Gardens

I’d really love your support if at all possible.

1 Like

Not sure of the protocol here (sorry if I’m violating it), but I wanted to bump my proposal which appears to have one more day of voting. Here’s the proposal: Gardens. We’d love your support if you’re willing to give it.

Hello all,

I would like to formally thank the Giveth community for awarding the TrueBlocks team a grant [above]. We look forward to working with you to develop a system to identify recirculating GIVbacks.

While we’re just getting started, I wanted to update you on our progress thus far and discuss the next steps.

As a reminder, our proposal puts forward three primary steps:

  1. Work with the Giveth team to better understand the recirculation problem in order to identify ways to programmatically address the issue.

    Update as of April 1, 2022:

  • During the proposal process, we joined in on a few calls with team members and participated in two Friday Fraud Review calls. At this point, we feel we have a better understanding of the problem and how it’s currently being addressed through a manual process. This will inform how we can automate this process.
  • Our understanding of the problem will grow over time, but at first blush, the problem seems relatively straightforward, so we can proceed to the next step.
  2. The second part of our proposal was to complete a “port” of TrueBlocks to the Gnosis Chain.

    Update as of April 1, 2022:

  • We’ve made good progress on this task even while waiting for this proposal to pass. We now have a feature called “Multi-Chain Support” (including Gnosis chain). We have a currently-working version of TrueBlocks running against the Gnosis Chain on our local systems. This work is not complete, however.
  • The portions of the work that are not complete are:
    • Testing – our Gnosis Chain port is working, but not well tested. We’re actively building more robust testing.
    • Documentation – if, as is our hope, Giveth (and other projects) run their own TrueBlocks instance (self-hosted on a dAppNode, for example), there needs to be better documentation. See here for current progress: Multi chain - TrueBlocks.
    • Publication of the Unchained Index – A very important part of TrueBlocks is what we call the Unchained Index. This is a method by which we not only create an index of “every appearance of every address anywhere on a chain,” but wherein we publish access to that index in a way that makes it impossible for us to ‘withhold’ or ‘censor’ it. This feature (the Unchained Index) already works on Mainnet Ethereum, and there is some work involved in bringing this feature to Gnosis.
  3. The third part of our proposal was to (a) stand up a monitoring/data pipeline process for Giveth against both the Gnosis Chain and Mainnet Ethereum Chain (part of that work was to help the Giveth team stand up their own pipeline in the future so that they can independently maintain their own monitoring); (b) identify and collect relevant grant recipient addresses to do the analysis; (c) write the code needed to automate the identification of potential recirculating grant funds; and (d) analyze the results and iterate.

    Update as of April 1, 2022:

  • During the proposal period, we expanded on existing work we had done on the GitCoin ecosystem (https://tokenomics.io/gitcoin). We will be standing up a similar system for Giveth on our servers next week. We also hope to have further discussions with the Giveth DevOps team to ensure an anticipated integration with existing systems. Those conversations can begin now.
  • Once this monitoring system is up and running on our servers, we can start analyzing the data to effectuate the solution. We’ve mentioned in the past that we’ve worked with a great data scientist at Gitcoin (his name is Richard_). He would like to participate in this Giveth work and potentially become a contributor to the DAO. I’m in touch with hanners717 in your group regarding this.
  • We anticipate iterating with your team on the specifics of the data analysis, the output data generated, and how the ongoing monitoring system might work in the future.

Summary: We’re ready to start working on this project in earnest next week. Scraping and monitoring against both Gnosis Chain and MainNet Ethereum are “mostly” working but need refinement (an ongoing process). The “data pouch” (https://tokenomics.io/) will provide similar data to the GitCoin site and should be in its first version in a week or two. The process of finding a good data scientist to help us analyze the data has begun.

Thanks for reading.

If you have any questions or would like to have a call to discuss, please let me know. I hope you find this helpful. If you have any suggestions on how to improve our progress reporting, please let us know. Our plan is to provide a formal update like this every other week.

Post Script: My co-worker, Dawid Szlachta, who’s been working with us for about a year, will be helping with this work. He’s an excellent resource and is available on our Discord.

1 Like

Hello Giveth Community,

Since our last update, we’ve been very busy. Please see the original proposal for more information.

The original proposal had three parts:

Part I – Discussions Pertaining to the Recirculation Issue

This part of the project, as previously reported, is complete.

Part II – Support for Gnosis Chain and Creation of Tokenomics Data Dump

The second part of our original proposal had two subtasks. The first subtask was to extend TrueBlocks indexing to the Gnosis chain. The second subtask was to use that new capability to create a Tokenomics Data Dump of the entire Giveth ecosystem.

Part II-Subtask 1. Support for Gnosis Chain

This subtask is complete, and, as we expected, was easy to extend to other EVM-based chains. In the future, if Giveth were to support other chains (such as Polygon), our indexer will easily work with those other chains.

We’ve been running the TrueBlocks indexer on both the Ethereum mainnet and Gnosis since early April. Henceforth, anyone with access to the UnchainedIndex smart contracts (which is everyone with an RPC) can retrieve the chain’s entire index without permission (from IPFS).

We’ve also released a pre-alpha version of a dAppNode package that uses the UnchainedIndex to build a local copy of the index. In the future, we would like to see Giveth (and other projects) run this dAppNode package. This will provide Giveth with unprecedented decentralized–and super fast–access to its own data.

An official version of the dAppNode package should be available shortly (Note this work will be completed on our own time, not Giveth’s).

Part II-Subtask 2. – Enable a Tokenomics.io Data Dump Website

The second subtask of part II of our proposal required the first subtask. With multi-chain support, we can now build a Tokenomics Data Dump website for Giveth (Data Pouch - Version III).

This subtask is “complete enough” (that is, it’s far enough along to move on to the primary project of identifying recirculating donations).

We spent much of our time this month on this subtask. This demonstration website now includes a full data extraction of every one of the nearly 1,260 addresses on Giveth’s purple list (although only about 120 addresses have activity).

Please note the unique nature of this dataset. Most datasets one sees from an Ethereum project are either (a) generated by an ad-hoc process (many of which rely on pay-wall gated APIs), or (b) use theGraph (which is free for now, but won’t be forever). Regardless of the method, neither of those two processes produces a dataset similar to the one presented here. The unique aspect of the TrueBlocks dataset is that it includes full transactional histories of every involved address. (We call this Ecosystem accounting).

This dataset includes not only transactions related to Giveth, but all transactions of all the addresses that have ever received donations on Giveth. This depth of detail is required for the third and final part of the project.

How Does TrueBlocks Work

A short diversion to help explain what happens under the covers.

TrueBlocks creates “monitors” which can be run as often as you wish (in Giveth’s case, we run them every five minutes).

These monitors read a simple list of addresses (in Giveth’s case, a list of addresses we’ve scraped from your purple list).

Given this list of addresses, TrueBlocks watches for new transactions.
Whenever an address transacts on-chain, TrueBlocks pulls seven different types of data from that transaction. (This list of data types to pull is also customizable per project.) In Giveth’s case, we pull…

…appearances, transactions, logs, balance histories, neighbors (more below), etc. and then we combine and compress the data. (Data types are defined here: tokenomics.io/giveth/exports at master · TrueBlocks/tokenomics.io · GitHub)
This produces separate datasets for each address for each data type. The data structure, which is per-chain, looks similar to this:

image
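To make the “one dataset per address per data type, per chain” idea concrete, here is a small sketch of a single monitoring pass. It is not the TrueBlocks implementation: the `fetch` callback, the `(chain, address, data_type)` keys, and the stand-in `fake_fetch` function are all assumptions for illustration, and only the five data types named above are listed.

```python
from typing import Callable, Dict, List, Tuple

# The five data types named in this post (the post mentions seven in total).
DATA_TYPES = ["appearances", "transactions", "logs", "balances", "neighbors"]

# Hypothetical signature: (chain, address, data_type) -> list of records.
Fetcher = Callable[[str, str, str], List[dict]]
DatasetKey = Tuple[str, str, str]


def run_monitor_pass(
    chain: str, addresses: List[str], fetch: Fetcher
) -> Dict[DatasetKey, List[dict]]:
    """Build one dataset per (chain, address, data type) combination."""
    datasets: Dict[DatasetKey, List[dict]] = {}
    for address in addresses:
        for data_type in DATA_TYPES:
            datasets[(chain, address, data_type)] = fetch(chain, address, data_type)
    return datasets


def fake_fetch(chain: str, address: str, data_type: str) -> List[dict]:
    """Stand-in for a real extraction call; returns one dummy record."""
    return [{"chain": chain, "address": address, "type": data_type}]


result = run_monitor_pass(
    "gnosis", ["0x2bf034eccebc8cd60dab9c249b6c2996dcb7d8ec"], fake_fetch
)
```

In a real deployment the pass would be scheduled (every five minutes in Giveth’s case) and the datasets written wherever the project wants them, rather than held in memory.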

This process of monitoring and extracting data from a collection of addresses is currently running on our server, and we are happy to continue to provide this service free of charge to the Giveth community. However, our ultimate goal would be for you to stand this capability up for yourselves on a dAppNode or similar system.

Summary for Part II: We’ve completed two significant portions of this work with the help of Giveth’s funding: (a) support for multi-chain, and (b) multi-chain data extraction for a large (1,260) collection of recipient addresses.

Part III – Identification of Recirculated Donations

What remains?

All of the above is preamble to the actual work, which we are hoping to complete in the next few weeks.

We now have the ability to scrape multiple chains. We also have the ability to extract full transactional histories for a large (1,260) collection of addresses. And, important to us, we have the ability to do this in a fully decentralized way on a dAppNode.

We’re now ready to begin the task of identifying recirculated transactions.

We’ve written a few preliminary ideas here:
Recirculation on Giveth Platform - Google Slides. This thinking is out of date. One of our tasks is to extend this document and fill in missing details.

One of our colleagues, @Richard, has done some preliminary proof of concept work using Dune Analytics. Dune.

This work uses Dune, which will serve as a double check on our results. Note that while this work is fairly clear, it suffers from a few shortcomings: (1) it is not automated (the addresses need to be copied in), and (2) it doesn’t scale into the full transactional history of each address. This work digs only two levels deep into each address’s history. For our purposes, we will need to dig much deeper. This does help us understand two things, though: (1) there do appear to be recirculations, and (2) they happen relatively shallowly in the transaction history of some addresses. Interesting.

We’ve also spent time working on a Dynamic Traverser. This technique is documented here: https://tjayrush.medium.com/dynamic-traversers-in-trueblocks-7e2215cb1af9.

Dynamic Traversers are extremely powerful. They allow us to efficiently produce a list of ‘neighbors’ to a given address. Not only that, we can recursively traverse the list of neighbors. It is this ability that will allow us to solve this problem.

One note: searching the transaction history of all previous senders is probably an intractable problem. The tree of historical transactions grows unboundedly (at least as far back as the first block). We have a number of ideas in mind to short-circuit this search, and will document them, to make this intractable problem tractable.
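As one simple illustration of how such a search can be bounded (not necessarily the approach we will end up documenting), the sketch below walks “senders of senders” breadth-first, stops at a fixed depth, and never revisits an address. The `senders_of` mapping and the short labels in the toy graph are hypothetical stand-ins for real neighbor data.

```python
from collections import deque
from typing import Dict, Set


def senders_within(
    start: str, senders_of: Dict[str, Set[str]], max_depth: int
) -> Dict[str, int]:
    """Breadth-first walk over senders, capped at max_depth levels.

    Returns each reachable sender mapped to the depth at which it was
    first seen. The visited check keeps cycles from looping forever.
    """
    depth_of = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        depth = depth_of[current]
        if depth == max_depth:
            continue  # depth cap: do not expand any further
        for sender in senders_of.get(current, ()):
            if sender not in depth_of:
                depth_of[sender] = depth + 1
                queue.append(sender)
    del depth_of[start]  # report only the senders, not the start address
    return depth_of


# Toy graph: key = address, value = addresses that have sent to it.
toy_graph = {
    "donor": {"a", "b"},
    "a": {"c"},
    "b": set(),
    "c": {"donor", "d"},  # "c" also received from "donor": a cycle
}

two_levels = senders_within("donor", toy_graph, 2)
three_levels = senders_within("donor", toy_graph, 3)
```

The depth cap trades completeness for tractability: anything deeper than `max_depth` is simply not examined, so the cap has to be chosen with the recirculation patterns actually observed in mind.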

This will be our focus for the remainder of the project.

We hope to have the first working version of this ready in the next few weeks and will report back here when we have further results.

Cheers.

4 Likes

Amazing progress! Can’t wait to see the final product!

2 Likes

Hello Giveth Community,

Executive Summary: It’s been a while – sorry for that, but starting next week, we’re full-time Giveth until we’re done.

Work Since Last Update

One of the biggest outstanding issues in the original proposal is the issue of “How does TrueBlocks deliver a solution without becoming a web 2.0 ‘solutions provider.’” If you’re aware of our work at all, you’re aware that we are “maxis” when it comes to the issue of decentralization.

At the end of our most recent update, our “heroes” were describing a ‘Data Pouch’ (Data Pouch - Version III) that we built. If you look closely at that site, you’ll see that the data has not been updated since May 5. In other words, our hero’s existing solution was not robust. Our goal with the “data pouch” was that it could be installed and be run unattended, continually freshening its data. That did not work as planned. Furthermore, you may notice that providing a website that delivers data contradicts the previous point I made that TrueBlocks does not want to become a “service provider.”

How do we reconcile those two seemingly opposing views?

Answer: dAppNode

Instead of solving the robustness issue with the current data pouch, we turned our focus to completing a dAppNode version of TrueBlocks based on Docker. This has the happy consequence that the same solution we provide on dAppNode (i.e. Docker) can be used on Tokenomics for the data pouch. Killing two birds with one stone, as it were.

Since we last reported, we’ve been very focused on:

  1. Building a Docker version of the trueblocks-core, and
  2. Producing a dAppNode package based on Docker.

This will allow us, in the coming weeks, to stand up a more robust version of the data pouch, and moving forward, to provide a solution to Giveth (and anyone else who’s interested) to stand up the same tool on a dAppNode – thereby finally realizing our desire to produce decentralized data access directly from the EVM client software.

Setbacks

There was another issue that set us back a few weeks during the last month. There was a fairly serious bug in Erigon. Because of the unique nature of our software (historical indexing), few others encountered this issue. While it would be better practice for us to use a more stable branch of Erigon, we’ve chosen to use their “bleeding edge.” About six weeks ago, they had a bug that our code didn’t notice until about three weeks ago. This had the unfortunate effect of injecting invalid data into our index. We’re still working on a solution to that issue, and this has taken precedence over the rest of our work. We hope to complete this “recovery” work soon.

Future Work

In the next few weeks, our focus will be:

  1. Finish docker version
  2. Use the docker version for the existing tokenomics/giveth data pouch
  • This will automate the production of the Giveth’s data
  • This will make the existing tokenomics systems more robust
  3. At the same time, write the code for the custom traverser that identifies “recirculating” donations.

    As part of this work, we will add a new column to the display showing different labels for each address. For example, “new” donors vs. “existing” donors, “grants” vs. “donors”, etc.

    Of particular interest will be the “recirculated” tag which will identify addresses that have recirculation behavior in their history.

  4. Notes:

  • The user will be able to click on a tag to filter the data. For example, clicking on …recirc… will show only the addresses identified as having suspicious activity.
  • The tags are “predicates” (either true or false), so it will be easy to add any number of additional tags that may be of interest to Giveth.
  5. Produce the same exact display (and all the necessary back-end integrations) on the dAppNode.
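To show why predicate-style tags are easy to extend, here is a minimal sketch of the tagging and filtering described in the notes above. The record fields (`prior_donations`, `recirculation_hits`) and the tag definitions are assumptions for illustration; the real display may compute them differently.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Predicate = Callable[[Record], bool]

# Each tag is a true/false test on an address record.
# Adding a tag means adding one entry to this dictionary.
TAGS: Dict[str, Predicate] = {
    "new": lambda r: r["prior_donations"] == 0,
    "existing": lambda r: r["prior_donations"] > 0,
    "recirculated": lambda r: r["recirculation_hits"] > 0,
}


def tags_for(record: Record) -> List[str]:
    """Names of all tags whose predicate is true for this record."""
    return [name for name, test in TAGS.items() if test(record)]


def filter_by_tag(records: List[Record], tag: str) -> List[Record]:
    """Keep only the records for which the named tag is true."""
    return [r for r in records if TAGS[tag](r)]


# Hypothetical records with made-up labels instead of real addresses.
records = [
    {"address": "donor_a", "prior_donations": 0, "recirculation_hits": 0},
    {"address": "donor_b", "prior_donations": 3, "recirculation_hits": 2},
]

flagged = filter_by_tag(records, "recirculated")
```

Clicking a tag in the display would correspond to calling something like `filter_by_tag` with that tag’s name.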

Once step 5 is complete, we will consider the project complete; however, we recognize that Giveth will have an ongoing issue maintaining this system. As you’ll see in the original proposal, we discussed the need to “Help define methods for Giveth to bring the above function in-house or find a dedicated node endpoint for ongoing monitoring”. We will remain engaged until this is resolved. There are three possible paths:

  1. Giveth can run the Tokenomics monitoring code (and a version of Erigon) in-house,
  2. Enter into a for-pay relationship with TrueBlocks to maintain the Tokenomics website,
  3. Install and run TrueBlocks and Erigon on a dAppNode and run the data in a decentralized manner.

This third option is of most interest to us, as this is the primary path for our development efforts. It would also allow us to exercise the muscles needed to provide such a piece of software to end users, something we think we will be fully ready to do once this project is completed.

Conclusion

In short, we’ve come a long way, but there is more work to be done. We hope to have substantial completion of the project by mid to late July.

Thanks for reading and we welcome any questions/comments.

5 Likes

Wow! I have been wanting a dappnode package for years. This is great news!

2 Likes

This is a very happy solution, love seeing how projects in the Giveth Galaxy rise into ever greater collaboration for improving the ecosystem as a whole!

Looking deeper into how the monitoring tool gives us data for identifying recirculation, I think about the mitigation portion of this proposal.

As someone who makes a lot of donations AND helps a lot of project owners raise/disburse funds - if there is any way I can support the process including identifying best practices that are revealed through data analysis, particularly around communicating them to donors and/or project owners please let me know.

3 Likes

We made some decent progress on the project this week, primarily in two areas: (1) the data site (https://tokenomics.io/giveth), which is now being updated continually. If you’re a data scientist, you may find some of its data interesting. We’d love your feedback.

  1. TrueBlocks is now dockerized and will run on dAppNode (although we’ve not yet officially released it).

Some of the Giveth community may be interested in this data for themselves. Who knows, perhaps there are data scientists out there who want to see every on-chain transaction (and trace and log and neighbor and account balance) for all the Giveth projects.

The second thing we’re working on is the ‘scanning the history for circulations.’ I’ve gotten as far as being able to find all the senders to all donors in a round which is the first step. Once we have that list, then we can recursively find the “senders to the senders to the donors” and so on. If there’s an infraction, it will be in the list of senders to senders to senders ad infinitum (or at least back to the start of Giveth). We spent time reviewing the round 17 data by hand. I hope to use the tool to review the round 18 data this coming week.

Just a quick example of what we have so far.

We’ve looked back two levels – so at “senders to donors” from round 17. That is, for each donor, we’ve looked at all the addresses that ever sent into those addresses.

We found that the following projects, which are on the purple list, have sent at least one transaction to one of those donors at some point in the past.

This is very likely totally normal.

It’s likely that these addresses are interrelated. The next goal is to see if any of these gave to a donor who then gave back to them. That code is not yet finished.

31-Giveth:Project    0x2bf034eccebc8cd60dab9c249b6c2996dcb7d8ec    Moloch Cloudship
31-Giveth:Project    0x41cb654d1f47913acab158a8199191d160dabe4a    Coletivo XV
31-Giveth:Project    0x5a9ce898f0b03c5a3cd2d0c727efdd0555c86f81    Shenanigan
31-Giveth:Project    0x6d97d65adff6771b31671443a6b9512104312d3d    Life - New Creation Association
31-Giveth:Project    0x7554f10da3ed7128300577e55abcd8f8835bcee4    Diamante Bridge Collective
31-Giveth:Project    0x8ba6cd4f47e0ad7240287a292cdfe22a4d50e075    BrightID Proof of Uniqueness
31-Giveth:Project    0xdfb0b22940cc45283d639d22ba26aa55ab8bfee8    Crypto Commons Association
31-Giveth:Project    0xedd425359fb15e894c639b6a74112954486146b9    Diamante Luz Center for Regenerative Living
                        0x5b0f8d8f47e3fdf7ee1c337abca19dbba98524e6
                        0x809c9f8dd8ca93a41c3adca4972fa234c28f7714
                        0xe711f474efeb84dd554731b8587b2753d86fe138

This is similar at two levels deep to what Giveth does automatically in their API at one-level deep by checking the purple list directly against donors.

For that one-level view, the system asks, “Is the donor on the purple list?” The above asks the question, “Does the donor have any senders in their past that are on the purple list?”
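The difference between the two questions can be sketched in a few lines. This is an illustration only: the `senders_of` mapping and the short labels are made up, and neither function is Giveth’s API or our traverser code.

```python
from typing import Dict, List, Set


def donor_on_purple_list(donor: str, purple_list: Set[str]) -> bool:
    """One-level check: is the donor itself a purple-list address?"""
    return donor in purple_list


def purple_senders_of(
    donor: str, senders_of: Dict[str, Set[str]], purple_list: Set[str]
) -> List[str]:
    """Two-level check: which purple-list addresses ever sent to the donor?"""
    return sorted(senders_of.get(donor, set()) & purple_list)


# Hypothetical data: short labels stand in for real addresses.
purple_list = {"project_1", "project_2"}
senders_of = {
    "donor_a": {"exchange_x", "project_1"},
    "donor_b": {"exchange_x"},
}

flagged = {
    donor: purple_senders_of(donor, senders_of, purple_list)
    for donor in sorted(senders_of)
}
```

Here `donor_a` passes the one-level check (it is not on the purple list) but still shows up in the two-level view because `project_1` once sent to it. As noted above, that by itself is very likely normal and only marks where to look next.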

It’s not quite the same, but it’s progress. More next week…

3 Likes