/ Big data

Exploring relationships on the ledger: Part two

Thomas Silkjær

Thomas Silkjær

Creative Director at 2K/DENMARK. Enthusiastic about the Internet of Value and everything XRP and Coil. Hobby ledger investigator. Opinions are my own and not the views of my employer.

Read More

A series on how the decentralised XRP ledger can be used to explore relationships between wallets using a graph database. This second part focuses on the initial population and initial exploration.

Picking up where part one left off, I was looking at months of importing from rippled nodes. Looking at alternative solutions, I came across the rippled-historical-database, especially this:

The Backfiller is best for filling in relatively short histories of transactions. Importing a complete history of all XRP Ledger transactions using the Backfiller could take weeks. If you want a full history, we recommend acquiring a database dump with early transactions and importing it directly.

Wietse Wind came to the rescue and provided me with a 2.5 TB SQLite dump of the complete transaction history.

SQLite population

Rewriting the script to support SQLite was pretty straightforward – instead of querying ledgers and transactions from an API, I could query a local SQLite database directly instead. After about a week spent downloading data and populating the database, I finally had my local Neo4j database ready. Stats per December 20 (ledger 43,834,340):

  • Nodes: 45,490,668
    • Wallets: 1,446,228
    • Payments: 44,044,440
  • Relations: 89,534,972 million
    • Has sent: 44,044,440
    • Has received: 44,044,440
    • Has activated: 1,446,092

The model is quite simple, only containing two types of nodes and three types of relations.

Schema-1

If you want to populate your own local or hosted database, I have made the scripts available here.

Release preparation

As my goal is making the database publicly available, I spent some of the waiting time to prepare a “tutorial style” browser guide to complement the database browser.

Already having made a simple one in my initial tests, which can be seen in the video of part one, I built on top of that but explored other simple queries for exploring the network. I want to make sure that everybody can connect and get useful information, harnessing the power of the database without much learning first. Here are some examples of queries, written in the Cypher Graph Query Language:

Shortest path

Find the shortest path between two wallets, not longer than 10 connections away, only through wallet activations:

MATCH(w1:Wallet {address:"wallet one address"})
MATCH(w2:Wallet {address:"wallet two address"})
MATCH path = shortestPath( (w1)-[:HAS_ACTIVATED*..10]-(w2) )
RETURN w1,w2,path

Shortest path

Or finding the shortest path only through only payments between two wallets:

MATCH(w1:Wallet {address:"wallet one address"})
MATCH(w2:Wallet {address:"wallet two address"})
MATCH path = shortestPath( (w1)-[:HAS_SENT|:HAS_RECEIVED*..5]-(w2) )
RETURN w1,w2,path

Shortest path 2

Payments and dates

Find all payments of more than 50,000 XRP that happened between December 15 and December 19, 2018, from one specific wallet.

MATCH (w1:Wallet {address:"wallet address"})-[:HAS_SENT]->(payment:Payment)-[:HAS_RECEIVED]->(w2:Wallet)
WHERE payment.amount > 50000
AND payment.currency = "XRP"
AND datetime("2018-12-15") <= payment.date <= datetime("2018-12-19")
RETURN w1, payment, w2

Payments and dates

Finding descendants (family tree)

Find all descendants in two generations from a wallet.

MATCH(w1:Wallet {address:"wallet address"})-[r1:HAS_ACTIVATED]->(w2:Wallet)-[r2:HAS_ACTIVATED*0..1]->(w3:Wallet)
RETURN w1,w2,w3,r1,r2

Descendants

Making it available

Neo4j reached out early in the process, and they have offered to sponsor hosting of the dataset to make it publicly available! Not alone hosting, but also offering help an insight in applying graph algorithms and other exciting possibilities. Very much appreciated, thank you!

Personally, I can't wait to see what we can learn from this. I almost feel like a kid at Christmas, waiting for opening his presents (wow, timing!).

What's next? Public access and first dive into the data, stay tuned for part three!

Part one is available here.


Did you like this post by Thomas Silkjær?

Send some love:

Thomas Silkjær

Thomas Silkjær

Creative Director at 2K/DENMARK. Enthusiastic about the Internet of Value and everything XRP and Coil. Hobby ledger investigator. Opinions are my own and not the views of my employer.

Read More