Definition of ContestNode.Ref and locating remote object #5

Open
opened 2026-04-07 16:30:57 +02:00 by maleszka · 1 comment
Owner

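The platforms I wanted to support are:

- https://satori.tcs.uj.edu.pl/
- https://codeforces.com/
- https://sim.13lo.pl/
- https://solve.edu.pl/
- https://szkopul.edu.pl/
- https://sio2.mimuw.edu.pl/
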
I started implementing backends other than sim to check whether the current design works well with different platforms, and I'm surprised by how differently some platforms identify contests, rounds, and problems.

Based on sim, I decided that:

- `ContestNode.Ref` is a packed struct that contains the data necessary to download more information about a given entity from the remote server
- `ContestNode.Ref` is backed by a `u64`, which is then used as the vertex id in the graph
- `ContestNode.Ref` is thus unique across all nodes; it serves as the primary key of the `contest_nodes` relation
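
To make the "packed struct backed by `u64`" idea concrete, here is a minimal sketch, written in Python rather than the project's Zig so it can be run standalone. The field layout (8 bits of node kind, 56 bits of remote id) and the names `pack_ref` and `unpack_ref` are assumptions for illustration; the actual layout of `ContestNode.Ref` is not given in this issue.

```python
# Hypothetical layout: the real field names and widths of ContestNode.Ref are
# not stated in this issue. Here: 8 bits for the node kind, 56 bits for the id.
ID_BITS = 56
KINDS = ("contest", "round", "problem")


def pack_ref(kind: str, remote_id: int) -> int:
    """Pack (kind, remote_id) into a single unsigned 64-bit integer."""
    if not 0 <= remote_id < (1 << ID_BITS):
        raise ValueError("remote id does not fit in 56 bits")
    return (KINDS.index(kind) << ID_BITS) | remote_id


def unpack_ref(ref: int) -> tuple[str, int]:
    """Recover (kind, remote_id) from the packed 64-bit value."""
    return KINDS[ref >> ID_BITS], ref & ((1 << ID_BITS) - 1)


# The packed integer doubles as the vertex id and as the primary key.
contest_nodes = {pack_ref("problem", 1234): "example problem"}
```

This works only as long as everything needed to address the entity is numeric and fits in 64 bits, which is exactly what the string codenames discussed in this issue break.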

But there are two problems with this:

1. on _all_ platforms other than sim, the ancestors of a `ContestNode` must be known in order to download info about a specific node; i.e. it is not possible to tell which contest a given problem belongs to without knowing the ancestor ids, because every API call asking about a specific problem must also specify the contest context
2. sio2, szkopul, and solve use string codenames to identify contests 😭

Problem 1 is quite easy to solve, because we can just add an `ancestors: []const platforms.data.ContestNode` argument to the methods in `PlatformAPI`.
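
A rough sketch of what passing ancestors buys us, again in Python for brevity. The `ContestNode` fields, the `problem_path` helper, and the `/c/<contest>/p/<problem>` path pattern are all made up for illustration and do not correspond to any real platform endpoint or to the actual `PlatformAPI` signatures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContestNode:
    kind: str       # "contest", "round" or "problem"
    remote_id: str  # id or codename as the platform reports it


def problem_path(node: ContestNode, ancestors: list[ContestNode]) -> str:
    """Build a request path for a problem; `ancestors` is ordered root-first.

    The path pattern is hypothetical. The point is only that the contest
    context comes from the ancestors, not from the node itself.
    """
    contest = next((a for a in ancestors if a.kind == "contest"), None)
    if contest is None:
        raise ValueError("a problem cannot be located without its contest")
    return f"/c/{contest.remote_id}/p/{node.remote_id}"
```

With this shape, the node's own ref only has to be unique within its parent, and the caller (who walks the graph anyway) supplies the rest.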

Problem 2 is much harder, because it shows that we actually have two things to solve:

- how do we reconstruct all the data required to identify an entity on the web platform? how do we encode codenames and string refs?
- how do we identify a vertex in the graph? what should the primary key of the relation be?
maleszka commented (Author, Owner):

Regardless of which option we pick, I think there are two things we have to do:

- add an `ancestors` argument to the `PlatformAPI` methods
- add a field to `ContestNode` holding a serialized, implementation-internal struct with any additional information (although I hope we can avoid that)

When it comes to the vertex id, we have 3 options:

1. use slices (`[]const u8`) as the primary key; this requires changing the implementation of `Relation` and `fetcher.ContestGraph`, but gives us the flexibility of keys of arbitrary length
2. store a `u64` obtained by hashing (with something from `std.crypto.hash`) a value that is unique to the contest node; currently sim has 6500 contest nodes and codeforces has about 20000; assuming at most 10^7 nodes, the probability of a collision is about 0.0003% with `u64` (birthday bound, n²/2⁶⁵ ≈ 2.7·10⁻⁶) and ridiculously small with `u128`
3. bump `Ref` to `u256`, which can hold a 42-character string at 6 bits per character (42 × 6 = 252 bits); we can assume that a codename is simply a URL-safe base64 value (they never go beyond the `[a-zA-Z0-9-_]` alphabet).
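
The numbers in options 2 and 3 can be checked with a few lines. This is a back-of-the-envelope sketch using the standard birthday approximation and assuming the hash output is uniformly distributed; `collision_probability` is a helper name introduced here, not something from the codebase.

```python
import math


def collision_probability(n: int, bits: int) -> float:
    """Birthday-bound probability that n uniformly random `bits`-bit ids collide."""
    # p = 1 - exp(-n(n-1) / 2^(bits+1)); expm1 keeps precision for tiny values.
    return -math.expm1(-n * (n - 1) / 2 ** (bits + 1))


p64 = collision_probability(10**7, 64)
p128 = collision_probability(10**7, 128)
print(f"u64:  {p64:.2e}")   # roughly 2.7e-06, i.e. a few ten-thousandths of a percent
print(f"u128: {p128:.2e}")  # many orders of magnitude smaller

# Option 3: each character of a 64-symbol alphabet carries 6 bits,
# so a 256-bit integer has room for 256 // 6 whole characters.
max_chars = 256 // 6
print(max_chars)
```

The `u64` result is close to the figure quoted in option 2, and the last line confirms the 42-character limit for a `u256`.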

I would stick with option 2 for now. It does not require any changes to the codebase, and the collision probability seems very small. In the worst case, we can bump `Ref` to `u128`.
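
For reference, option 2 could look roughly like this. It is a sketch in Python, not the project's Zig code: the choice of SHA-256, the length-prefixed encoding, and the `vertex_id` name are assumptions, and the comment above does not say which function from `std.crypto.hash` would be used or what exactly gets hashed.

```python
import hashlib


def vertex_id(platform: str, path: list[str], bits: int = 64) -> int:
    """Derive a fixed-width vertex id from a node's full path on a platform.

    `path` lists the identifiers or codenames from the root contest down to
    the node itself. Each part is length-prefixed so that ("ab", "c") and
    ("a", "bc") do not produce the same hash input.
    """
    h = hashlib.sha256()
    for part in [platform, *path]:
        data = part.encode("utf-8")
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    # Truncate the digest to the requested width (64 now, 128 if ever needed).
    return int.from_bytes(h.digest()[: bits // 8], "big")
```

Including the platform name and the whole ancestor path in the input keeps ids unique across platforms even when two of them reuse the same codename, and moving to `u128` later only changes the `bits` argument.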

Reference
maleszka/wbij#5