justinko/strongmind

StrongMind

Rails API that ingests GitHub public PushEvent records, enriches actors and repositories, and stores structured data in PostgreSQL.

Prerequisites

  • Docker and Docker Compose v2

Start the system

From the repository root:

docker compose up --build

This builds the app image from Dockerfile.dev, starts PostgreSQL, runs migrations via the entrypoint (db:prepare), and runs the Rails server on port 3000.

Follow logs with:

docker compose logs -f app db

Compose sets RAILS_LOG_TO_STDOUT=true for the app and ingest services so Rails logs go to stdout (visible with docker compose logs -f).

Run ingestion

In another terminal (with docker compose up running so db is available):

docker compose run --rm ingest

This runs bin/rails github:ingest and fetches from the GitHub public events API.

Run tests

docker compose run --rm test

This sets RAILS_ENV=test, prepares the test database, and runs bin/rails test.

How to verify it’s working

What to expect in logs

After docker compose run --rm ingest, you should see lines similar to:

  • EventIngester: starting ingestion
  • On successful imports (log level debug): EventIngester: stored push event id=...
  • A completion line: EventIngester: ingest complete — X imported, Y skipped, Z failed (...) including [rate limit remaining: ...] when the client still has a last HTTP response

If GitHub rate-limits the events fetch or enrichment, you may see: EventIngester: rate limit reached (resets at ...), aborting.
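When checking a run from captured logs, the counters in the completion line can be pulled out with a small script. The following is only a sketch: parse_ingest_summary is a hypothetical helper, not part of this repository, and it relies solely on the "X imported, Y skipped, Z failed" wording quoted above.

```ruby
# Hypothetical helper (not part of the app): extract the three counters
# from a completion log line. Only the "X imported, Y skipped, Z failed"
# wording is matched; the rest of the line is ignored.
COMPLETION_PATTERN = /(\d+) imported, (\d+) skipped, (\d+) failed/

def parse_ingest_summary(line)
  match = COMPLETION_PATTERN.match(line)
  return nil unless match

  imported, skipped, failed = match.captures.map(&:to_i)
  { imported: imported, skipped: skipped, failed: failed }
end

# Illustrative line with made-up counts; \u2014 is the dash used in the log.
sample = "EventIngester: ingest complete \u2014 12 imported, 3 skipped, 0 failed"
summary = parse_ingest_summary(sample)
puts "imported=#{summary[:imported]} skipped=#{summary[:skipped]} failed=#{summary[:failed]}"
```

A line that does not contain the counters, such as the starting line, returns nil, so the helper can be run over a whole log without filtering first.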

What to check in the database

With the stack up (docker compose up), inspect counts:

docker compose exec app bin/rails runner "puts({ push_events: PushEvent.count, actors: Actor.count, repositories: Repository.count })"

Or a one-off container:

docker compose run --rm app bin/rails runner "puts PushEvent.order(:created_at).last&.attributes"

You should see rows in push_events, actors, and repositories after a successful ingest (exact counts depend on how many PushEvent items GitHub returned and how many were new).

How long until results appear

Ingestion is a single rake run: it usually finishes within seconds to a minute, depending on network and rate limits. Rows appear as soon as the task completes without error.

Local development without Docker

Requires Ruby (see .ruby-version) and PostgreSQL. Run bundle install, make sure your PostgreSQL setup matches config/database.yml, create the databases if needed, then:

bin/rails db:prepare
bin/rails github:ingest
bin/rails test

Data model

actors

One row per GitHub user seen in ingested events. github_id is the GitHub user id (unique). login is the GitHub username from the event. raw_payload is the JSON from GET /users/:login (Octokit client.user(login)).

repositories

One row per GitHub repository seen in ingested events. github_id is the GitHub repository id (unique). name stores the full name (owner/repo) from the enriched repo. raw_payload is the JSON from GET /repositories/:id (Octokit client.repository(id)).

New actors and repositories trigger a single API fetch each; repeats in the same or later runs reuse rows and skip extra HTTP calls.
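The fetch-once behaviour can be pictured with a minimal sketch. It makes several assumptions: ActorStore and find_or_enrich are hypothetical names, the in-memory hash stands in for the actors table, and the lambda stands in for the Octokit call. The app's actual implementation may differ.

```ruby
# In-memory stand-in for the actors table, keyed by github_id.
# In the real app this would be a database lookup; that part is assumed.
class ActorStore
  attr_reader :fetch_count

  def initialize(fetcher)
    @fetcher = fetcher # callable standing in for the GitHub API request
    @rows = {}         # github_id => row hash
    @fetch_count = 0
  end

  # Return the stored row; call the fetcher only for ids not seen before.
  def find_or_enrich(github_id:, login:)
    @rows.fetch(github_id) do
      @fetch_count += 1
      @rows[github_id] = {
        github_id: github_id,
        login: login,
        raw_payload: @fetcher.call(login)
      }
    end
  end
end

# Fake API response so the sketch runs without network access.
fake_api = ->(login) { { "login" => login, "type" => "User" } }
store = ActorStore.new(fake_api)

store.find_or_enrich(github_id: 1, login: "octocat")
store.find_or_enrich(github_id: 1, login: "octocat") # reuses the stored row
store.find_or_enrich(github_id: 2, login: "hubot")

puts store.fetch_count
```

The same pattern would apply to repositories, keyed by the repository's github_id.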

push_events

Each row is one GitHub PushEvent with:

  • raw_payload — full event.to_h from Octokit (JSONB) for audit and debugging.
  • actor_id / repository_id — foreign keys to enriched actors and repositories.
  • Structured columns (queryable without JSON parsing), mapped from the event payload:
    • push_id: from payload.push_id
    • ref: from payload.ref
    • head: from payload.head (commit SHA)
    • before_sha: from payload.before (prior commit SHA; named before_sha rather than the bare word before)
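The column mapping in the list above could be sketched as follows. This is an illustration, not the app's code: push_event_columns is a hypothetical name, the sample values are made up, and string keys are assumed (the hash produced by Octokit's to_h may use symbol keys instead).

```ruby
# Map a PushEvent hash to the structured column values.
# Assumes string keys; adjust if the event hash uses symbols.
def push_event_columns(event)
  payload = event.fetch("payload", {})
  {
    push_id: payload["push_id"],
    ref: payload["ref"],
    head: payload["head"],
    before_sha: payload["before"] # payload.before is stored as before_sha
  }
end

# Made-up event for illustration only.
sample_event = {
  "id" => "1234567890",
  "type" => "PushEvent",
  "payload" => {
    "push_id" => 42,
    "ref" => "refs/heads/main",
    "head" => "b" * 40,
    "before" => "a" * 40
  }
}

columns = push_event_columns(sample_event)
puts columns[:ref]
```

Keeping these four values in their own columns is what allows queries such as "all pushes to refs/heads/main" without reaching into the JSONB raw_payload.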

github_event_id is the GitHub event id string and remains the natural unique key for idempotent ingestion.
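To illustrate what idempotent ingestion keyed on github_event_id means, here is a minimal in-memory sketch. PushEventSink is a hypothetical class: the Set plays the role that a uniqueness check on github_event_id would play in PostgreSQL, and the app's real mechanism is not shown in this README.

```ruby
require "set"

# Hypothetical in-memory sink; @seen stands in for a uniqueness check
# on github_event_id in the database.
class PushEventSink
  attr_reader :rows

  def initialize
    @seen = Set.new
    @rows = []
  end

  # Returns :imported for a new event id and :skipped for a repeat.
  def ingest(event)
    # Set#add? returns nil when the id is already present.
    return :skipped unless @seen.add?(event.fetch("id"))

    @rows << event
    :imported
  end
end

sink = PushEventSink.new
events = [{ "id" => "100" }, { "id" => "101" }, { "id" => "100" }]

results = events.map { |event| sink.ingest(event) }
puts results.tally.map { |status, count| "#{count} #{status}" }.join(", ")
```

Running the same batch twice leaves the stored rows unchanged, which is the property that makes repeated ingest runs safe.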

About

Code challenge for StrongMind
