7: Data Wrangling with JavaScript

Speed drill on transforming data — the bread and butter of fintech frontends. Each exercise uses patterns from real interviews: reshaping API responses, aggregating transactions, building lookup structures. The deeper content covers TypeScript utility types, generic pipeline helpers, floating point traps in financial data, and performance characteristics at scale.


1. Reshaping Objects — Pick, Rename, and TypeScript Utility Types (~3 min)

APIs rarely return data in the shape your UI needs. The most common transform: pick a subset of fields and rename them.

type ApiUser = {
  id: number
  first_name: string
  last_name: string
  email_address: string
  department_id: number
  is_active: boolean
}

const apiUsers: ApiUser[] = [
  { id: 1, first_name: 'Alice', last_name: 'Chen', email_address: 'alice@ramp.com', department_id: 10, is_active: true },
  { id: 2, first_name: 'Bob',   last_name: 'Park', email_address: 'bob@ramp.com',   department_id: 20, is_active: false },
  { id: 3, first_name: 'Carol', last_name: 'Diaz', email_address: 'carol@ramp.com', department_id: 10, is_active: true },
]

Transform apiUsers into active users only, camelCase keys, full name combined:

type DisplayUser = {
  id: number
  fullName: string
  email: string
}

const displayUsers: DisplayUser[] = apiUsers
  .filter(u => u.is_active)
  .map(u => ({
    id: u.id,
    fullName: `${u.first_name} ${u.last_name}`,
    email: u.email_address,
  }))

// [
//   { id: 1, fullName: 'Alice Chen', email: 'alice@ramp.com' },
//   { id: 3, fullName: 'Carol Diaz', email: 'carol@ramp.com' },
// ]

Filter before map. Filtering first means the transform in .map() only runs on items that will actually appear in the output. For small arrays this doesn’t matter; for thousands of records with an expensive transform, the order is measurable.

TypeScript utility types for reshape operations. Pick and Omit let you derive output types from input types without duplicating field definitions:

// Pick only the fields you need — output type stays in sync with ApiUser
type UserPreview = Pick<ApiUser, 'id' | 'first_name' | 'last_name'>

// Omit fields you don't want — useful when the output is the input minus a few sensitive fields
type PublicUser = Omit<ApiUser, 'department_id' | 'is_active'>

// Partial — all fields optional (useful for patch operations)
type UserPatch = Partial<ApiUser>

// Required — all fields required (useful when you know the response is complete)
type FullUser = Required<ApiUser>

These compose: Pick<Partial<ApiUser>, 'id' | 'email_address'> gives you an object where only id and email_address can appear, both optional.
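
Pick only exists at the type level, so the runtime copy still has to be written. A minimal sketch of a runtime counterpart follows; the `pick` helper and `sampleUser` are hypothetical names introduced for illustration, not part of any library:

```typescript
type ApiUser = {
  id: number
  first_name: string
  last_name: string
  email_address: string
  department_id: number
  is_active: boolean
}

// Hypothetical helper: copies only the listed keys. The return type is derived
// from the input type via Pick, so it stays in sync with ApiUser.
function pick<T extends object, K extends keyof T>(source: T, keys: readonly K[]): Pick<T, K> {
  const result = {} as Pick<T, K>
  for (const key of keys) {
    result[key] = source[key]
  }
  return result
}

const sampleUser: ApiUser = {
  id: 1, first_name: 'Alice', last_name: 'Chen',
  email_address: 'alice@ramp.com', department_id: 10, is_active: true,
}

// Typed as Pick<ApiUser, 'id' | 'first_name' | 'last_name'>
const preview = pick(sampleUser, ['id', 'first_name', 'last_name'])
// { id: 1, first_name: 'Alice', last_name: 'Chen' }
```

Passing a key that is not on ApiUser fails at compile time, which is the main reason to type the key list as `keyof T` rather than `string`.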


2. Building Lookup Maps — Array to Record (~3 min)

O(1) access by ID instead of O(n) scanning. The difference is invisible at 10 items and severe at 10,000.

type Department = {
  id: number
  name: string
  budget: number
}

const departments: Department[] = [
  { id: 10, name: 'Engineering', budget: 500000 },
  { id: 20, name: 'Marketing',   budget: 200000 },
  { id: 30, name: 'Sales',       budget: 300000 },
]

Two approaches and when each is better:

// Object.fromEntries — cleanest for simple key→value maps
const deptMap = Object.fromEntries(
  departments.map(d => [d.id, d])
) as Record<number, Department>

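// reduce: necessary when you need to transform or accumulate while building.
// Mutate the accumulator. Spreading it ({ ...acc, [dept.id]: dept }) copies every
// existing key on each iteration, which turns an O(n) build into O(n²).
const deptMap2 = departments.reduce((acc, dept) => {
  acc[dept.id] = dept
  return acc
}, {} as Record<number, Department>)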

// Usage
deptMap[10].name  // 'Engineering'

Map vs plain object. A Record<string, V> is a plain object — keys are strings (or numbers, coerced to strings). JavaScript’s Map supports any key type and has better performance characteristics for frequent insertions/deletions:

// Map: better when keys aren't strings, or when you need insertion order / size
const deptMapM = new Map(departments.map(d => [d.id, d]))
deptMapM.get(10)?.name  // 'Engineering'
deptMapM.size           // 3
deptMapM.has(99)        // false — no coercion surprises

// Plain object: better for JSON serialization and when keys are string IDs
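
The key-coercion and serialization differences can be seen side by side. This is a small sketch with made-up variable names, not a benchmark:

```typescript
// Plain object: numeric keys are coerced to strings on the way in
const asObject: Record<number, string> = { 10: 'Engineering', 20: 'Marketing' }
const objectKeys = Object.keys(asObject)   // ['10', '20'] (strings, not numbers)

// Map: keys keep their type, so 10 and '10' are different keys
const asMap = new Map<number | string, string>([[10, 'Engineering'], [20, 'Marketing']])
const hasNumber = asMap.has(10)     // true
const hasString = asMap.has('10')   // false

// Serialization: a Map does not survive JSON.stringify directly
const mapJson = JSON.stringify(asMap)                        // '{}'
const roundTrip = JSON.stringify(Object.fromEntries(asMap))  // '{"10":"Engineering","20":"Marketing"}'
```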

Handling missing references. A lookup can return undefined if the key doesn’t exist. TypeScript’s Record<number, V> doesn’t reflect this — it claims every numeric key has a value. Be defensive:

const dept = deptMap[user.department_id]
if (!dept) throw new Error(`Unknown department ID: ${user.department_id}`)

// Or use optional chaining and a fallback
const deptName = deptMap[user.department_id]?.name ?? 'Unknown'

This matters especially for joins against data from different API endpoints that may be out of sync.
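
One way to make the compiler enforce the check is to put `undefined` into the value type and route lookups through a single helper. The sketch below assumes a hypothetical `lookupOrThrow` helper and a `deptById` table; TypeScript's `noUncheckedIndexedAccess` compiler option gives a similar effect project-wide.

```typescript
type Department = { id: number; name: string; budget: number }

// Typing the values as `Department | undefined` forces a check before use
const deptById: Record<number, Department | undefined> = {
  10: { id: 10, name: 'Engineering', budget: 500000 },
  20: { id: 20, name: 'Marketing', budget: 200000 },
}

// Hypothetical helper: keeps the "missing reference" decision in one place
function lookupOrThrow<T>(table: Record<number, T | undefined>, key: number, label: string): T {
  const value = table[key]
  if (value === undefined) throw new Error(`Unknown ${label}: ${key}`)
  return value
}

const found = lookupOrThrow(deptById, 10, 'department ID').name   // 'Engineering'
const fallback = deptById[99]?.name ?? 'Unknown'                  // 'Unknown'
```

Throwing suits data that must be consistent (a report that would otherwise be silently wrong); the fallback suits display code where a placeholder is acceptable.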

A reusable keyBy helper. This pattern appears so often that it’s worth a generic helper:

function keyBy<T, K extends string | number>(
  items: T[],
  getKey: (item: T) => K
): Record<K, T> {
  return Object.fromEntries(items.map(item => [getKey(item), item])) as Record<K, T>
}

const byId    = keyBy(departments, d => d.id)    // Record<number, Department>
const byName  = keyBy(departments, d => d.name)  // Record<string, Department>
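
One caveat worth knowing: Object.fromEntries keeps the last entry when two items share a key, so a keyBy built on it drops duplicates silently. If duplicates indicate bad data, a stricter variant can refuse them. `keyByUnique` below is a hypothetical helper sketched for illustration; it returns a Map rather than a Record.

```typescript
type Department = { id: number; name: string }

const rows: Department[] = [
  { id: 10, name: 'Engineering' },
  { id: 10, name: 'Eng (renamed)' },   // duplicate id
]

// Object.fromEntries: the later entry overwrites the earlier one
const lastWins = Object.fromEntries(rows.map(d => [d.id, d.name]))
// { '10': 'Eng (renamed)' }

// Hypothetical strict variant: throws on a duplicate key instead of overwriting
function keyByUnique<T, K extends string | number>(
  items: readonly T[],
  getKey: (item: T) => K
): Map<K, T> {
  const result = new Map<K, T>()
  for (const item of items) {
    const key = getKey(item)
    if (result.has(key)) throw new Error(`Duplicate key: ${key}`)
    result.set(key, item)
  }
  return result
}

const uniqueDepts = keyByUnique(
  [{ id: 10, name: 'Engineering' }, { id: 20, name: 'Marketing' }],
  d => d.id
)
// uniqueDepts.get(20)?.name is 'Marketing'; keyByUnique(rows, d => d.id) would throw
```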

3. Grouping — The Most Common Interview Pattern (~5 min)

Grouping is everywhere: transactions by category, users by department, events by date.

type Transaction = {
  id: string
  merchant: string
  amount: number
  category: 'software' | 'travel' | 'meals' | 'office'
  date: string
}

const transactions: Transaction[] = [
  { id: 't1', merchant: 'AWS',         amount: 2400, category: 'software', date: '2026-02-01' },
  { id: 't2', merchant: 'Delta',       amount: 580,  category: 'travel',   date: '2026-02-03' },
  { id: 't3', merchant: 'GitHub',      amount: 44,   category: 'software', date: '2026-02-03' },
  { id: 't4', merchant: 'Sweetgreen',  amount: 18,   category: 'meals',    date: '2026-02-01' },
  { id: 't5', merchant: 'Staples',     amount: 120,  category: 'office',   date: '2026-02-05' },
  { id: 't6', merchant: 'Uber Eats',   amount: 32,   category: 'meals',    date: '2026-02-05' },
  { id: 't7', merchant: 'Vercel',      amount: 20,   category: 'software', date: '2026-02-01' },
  { id: 't8', merchant: 'Hilton',      amount: 340,  category: 'travel',   date: '2026-02-05' },
]

Group by category:

// Modern: Object.groupBy (ES2024 — supported in all evergreen browsers, Node 21+)
const byCategory = Object.groupBy(transactions, t => t.category)
// Partial<Record<Transaction['category'], Transaction[]>>: each value is typed Transaction[] | undefined

// Classic: reduce — know this for interviews; not all environments have Object.groupBy
// Also lets you control the exact output type
function groupBy<T, K extends string>(items: T[], getKey: (item: T) => K): Record<K, T[]> {
  return items.reduce((acc, item) => {
    const key = getKey(item)
    if (!acc[key]) acc[key] = []
    acc[key].push(item)
    return acc
  }, {} as Record<K, T[]>)
}

const byCat = groupBy(transactions, t => t.category)

Object.groupBy returns T[] | undefined per key. TypeScript types the values as T[] | undefined because the type system can’t prove that every key will have entries. Narrow before use:

const softwareTxns = byCategory['software'] ?? []

Group then aggregate:

type CategorySummary = {
  category: string
  total: number
  count: number
  avgAmount: number
}

const summaries: CategorySummary[] = Object.entries(byCat).map(([category, txns]) => {
  const total = txns.reduce((sum, t) => sum + t.amount, 0)
  return {
    category,
    total,
    count: txns.length,
    avgAmount: Math.round(total / txns.length),
  }
})

// [
//   { category: 'software', total: 2464, count: 3, avgAmount: 821 },
//   { category: 'travel',   total: 920,  count: 2, avgAmount: 460 },
//   { category: 'meals',    total: 50,   count: 2, avgAmount: 25  },
//   { category: 'office',   total: 120,  count: 1, avgAmount: 120 },
// ]

Don’t try to group and aggregate in a single reduce. It’s harder to read, harder to test, and harder to debug. Group first, then aggregate each group in a separate pass.
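
The two-pass split can be sketched as two small functions that are testable on their own. The names `groupByCategory` and `summarize` and the three sample rows are assumptions made for this example; the grouping step uses a Map here only to keep the block self-contained.

```typescript
type Txn = { merchant: string; amount: number; category: string }
type CategorySummary = { category: string; total: number; count: number; avgAmount: number }

// Step 1: grouping only. A Map keeps categories in first-seen order.
function groupByCategory(txns: readonly Txn[]): Map<string, Txn[]> {
  const groups = new Map<string, Txn[]>()
  for (const t of txns) {
    const bucket = groups.get(t.category)
    if (bucket) bucket.push(t)
    else groups.set(t.category, [t])
  }
  return groups
}

// Step 2: aggregation only. Takes one group and knows nothing about how it was built.
function summarize(category: string, txns: readonly Txn[]): CategorySummary {
  const total = txns.reduce((sum, t) => sum + t.amount, 0)
  return { category, total, count: txns.length, avgAmount: Math.round(total / txns.length) }
}

const sampleTxns: Txn[] = [
  { merchant: 'AWS',    amount: 2400, category: 'software' },
  { merchant: 'Delta',  amount: 580,  category: 'travel'   },
  { merchant: 'GitHub', amount: 44,   category: 'software' },
]

const summaryRows = Array.from(
  groupByCategory(sampleTxns),
  ([category, txns]) => summarize(category, txns)
)
// [
//   { category: 'software', total: 2444, count: 2, avgAmount: 1222 },
//   { category: 'travel',   total: 580,  count: 1, avgAmount: 580  },
// ]
```

Because `summarize` takes a plain array, it can be unit-tested with a three-element fixture and no grouping logic in sight.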


4. Joining Data from Multiple Sources (~5 min)

Stitching data from different API endpoints — the JavaScript equivalent of a SQL JOIN. Correctness depends on building lookup maps first; performance depends on not doing nested O(n) scans.

type Expense = {
  id: string
  userId: number
  amount: number
  description: string
}

const expenses: Expense[] = [
  { id: 'e1', userId: 1, amount: 150,  description: 'Team lunch'     },
  { id: 'e2', userId: 1, amount: 2400, description: 'AWS monthly'    },
  { id: 'e3', userId: 2, amount: 580,  description: 'Flight to NYC'  },
  { id: 'e4', userId: 3, amount: 44,   description: 'GitHub copilot' },
  { id: 'e5', userId: 3, amount: 89,   description: 'Office supplies'},
]

Join pattern — build maps first, then enrich in one pass:

// O(n) to build maps, O(n) to enrich — total O(n)
const userMap  = keyBy(apiUsers,    u => u.id)
const deptMap4 = keyBy(departments, d => d.id)

type EnrichedExpense = {
  id: string
  amount: number
  description: string
  employeeName: string
  department: string
}

const enriched: EnrichedExpense[] = expenses.map(e => {
  const user = userMap[e.userId]
  if (!user) throw new Error(`Unknown userId: ${e.userId}`)
  const dept = deptMap4[user.department_id]
  if (!dept) throw new Error(`Unknown department_id: ${user.department_id}`)

  return {
    id: e.id,
    amount: e.amount,
    description: e.description,
    employeeName: `${user.first_name} ${user.last_name}`,
    department: dept.name,
  }
})

The O(n²) trap. The naive join does Array.find() inside .map() — every expense searches the entire users array:

// O(n * m) — fine for small data, quadratic at scale
const naive = expenses.map(e => {
  const user = apiUsers.find(u => u.id === e.userId)  // O(m) scan per expense
  // ...
})

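At 1,000 expenses × 500 users, that’s up to 500,000 comparisons in the worst case. With a lookup map it’s about 1,500 operations: 500 to build the map, then 1,000 lookups. In a browser, where a frame at 60fps has roughly 16 ms, repeating the nested scan on every render can cost dropped frames; in a server-rendered report over millions of rows, the difference is orders of magnitude.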
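
The gap is easy to measure by counting work instead of timing it. The sketch below uses synthetic data (ids 1 to 500, expenses spread evenly across users); the counters and variable names are made up for this example.

```typescript
type JoinUser = { id: number }
type JoinExpense = { id: string; userId: number }

// Synthetic data: 500 users, 1,000 expenses spread evenly across them
const users: JoinUser[] = Array.from({ length: 500 }, (_, i) => ({ id: i + 1 }))
const expenseRows: JoinExpense[] = Array.from(
  { length: 1000 },
  (_, i) => ({ id: `e${i}`, userId: (i % 500) + 1 })
)

// Naive join: count how many times the find() predicate runs
let findComparisons = 0
for (const e of expenseRows) {
  users.find(u => {
    findComparisons++
    return u.id === e.userId
  })
}

// Map join: one pass to build the map, then one lookup per expense
let mapOperations = 0
const userById = new Map<number, JoinUser>()
for (const u of users) {
  userById.set(u.id, u)
  mapOperations++
}
for (const e of expenseRows) {
  userById.get(e.userId)
  mapOperations++
}

// findComparisons → 250500 (find() stops at the first match, so 500,000 is the upper bound)
// mapOperations   → 1500
```

Even with early exit halving the worst case, the nested scan does over 160 times the work of the map version on this data, and the ratio grows with the size of the users array.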


5. Sorting and Ranking (~4 min)

Multi-criteria sorting and rank derivation.

type Employee = {
  name: string
  department: string
  totalSpend: number
}

const employees: Employee[] = [
  { name: 'Alice Chen', department: 'Engineering', totalSpend: 2550 },
  { name: 'Bob Park',   department: 'Marketing',   totalSpend: 580  },
  { name: 'Carol Diaz', department: 'Engineering', totalSpend: 133  },
  { name: 'Dan Kim',    department: 'Marketing',   totalSpend: 920  },
  { name: 'Eve Liu',    department: 'Engineering', totalSpend: 1800 },
]

Multi-criteria sort:

// Always spread before sorting — .sort() mutates in place
const sorted = [...employees].sort((a, b) => {
  // Primary: department ascending
  const deptCmp = a.department.localeCompare(b.department)
  if (deptCmp !== 0) return deptCmp
  // Secondary: spend descending
  return b.totalSpend - a.totalSpend
})

// Engineering: Alice (2550), Eve (1800), Carol (133)
// Marketing:   Dan (920),   Bob (580)

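Why localeCompare for strings. The naive a > b ? 1 : -1 comparison orders strings by UTF-16 code unit, so every uppercase ASCII letter sorts before every lowercase one ('Zoe' before 'adam'), and it knows nothing about locale-specific collation (ä sorts differently in German vs Swedish). It also never returns 0 for equal strings, which makes the comparator inconsistent. localeCompare is the correct API for human-readable string comparison.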

Stable sort. JavaScript’s .sort() has been guaranteed stable since ES2019 (Chrome 70, Firefox 3, Safari 10.1, Node 11). A stable sort preserves the original order of elements that compare as equal. Before ES2019, engines were free to use unstable algorithms — a historical gotcha in older codebases.
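
Stability is what makes multi-pass sorting work: sort by the secondary criterion first, then by the primary one, and ties in the second pass keep the order from the first. A small sketch using the same five employees (the variable names are made up for this example):

```typescript
type Employee = { name: string; department: string; totalSpend: number }

const staff: Employee[] = [
  { name: 'Alice Chen', department: 'Engineering', totalSpend: 2550 },
  { name: 'Bob Park',   department: 'Marketing',   totalSpend: 580  },
  { name: 'Carol Diaz', department: 'Engineering', totalSpend: 133  },
  { name: 'Dan Kim',    department: 'Marketing',   totalSpend: 920  },
  { name: 'Eve Liu',    department: 'Engineering', totalSpend: 1800 },
]

// Pass 1: secondary criterion (spend descending)
const bySpend = [...staff].sort((a, b) => b.totalSpend - a.totalSpend)

// Pass 2: primary criterion (department ascending). Because the sort is stable,
// employees in the same department keep the spend order from pass 1.
const twoPass = [...bySpend].sort((a, b) => a.department.localeCompare(b.department))

const twoPassNames = twoPass.map(e => e.name)
// ['Alice Chen', 'Eve Liu', 'Carol Diaz', 'Dan Kim', 'Bob Park']
```

The result matches the single multi-criteria comparator above. The single comparator is usually the better choice (one pass, intent in one place); the two-pass form is mostly useful when the criteria are applied at different times, such as a user clicking one column header and then another.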

Rank derivation with flatMap:

type RankedEmployee = Employee & { deptRank: number }

const ranked: RankedEmployee[] = Object.values(
  groupBy(sorted, e => e.department)
).flatMap(group =>
  group.map((emp, i) => ({ ...emp, deptRank: i + 1 }))
)

flatMap maps each group to a RankedEmployee[] and flattens the result one level. The alternative — .map(...).flat() — is equivalent but two method calls.


6. Floating Point and Cents (~2 min, but important)

Financial data almost always stores amounts as integer cents. The reason is floating point:

console.log(0.1 + 0.2)         // 0.30000000000000004
console.log(1.005.toFixed(2))  // "1.00" — rounds wrong due to float representation

The fix is to work in cents throughout and convert only for display:

// Store and compute in cents
const amountCents = 240000  // $2400.00

// Display only
function centsToDisplay(cents: number): string {
  return (cents / 100).toLocaleString('en-US', { style: 'currency', currency: 'USD' })
}
// centsToDisplay(240000) → "$2,400.00"

// When you must add float dollars: multiply to cents first
function dollarsToCents(dollars: number): number {
  return Math.round(dollars * 100)
}
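
The same effect shows up when summing. A minimal sketch comparing a float-dollar sum against an integer-cent sum; the two sample amounts are chosen only because they are the classic failing case:

```typescript
const dollarAmounts = [0.1, 0.2]

// Summing float dollars drifts
const floatTotal = dollarAmounts.reduce((sum, d) => sum + d, 0)   // 0.30000000000000004

// Converting each amount to integer cents first keeps the sum exact
const centsTotal = dollarAmounts
  .map(d => Math.round(d * 100))
  .reduce((sum, c) => sum + c, 0)                                 // 30

const floatMatches = floatTotal === 0.3          // false
const centsMatches = centsTotal / 100 === 0.3    // true
```

The conversion assumes each input has at most two decimal places; amounts with sub-cent precision need an explicit rounding rule agreed with whoever owns the numbers.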

Branded types for domain safety. A common production pattern is branding primitive types to prevent accidentally passing dollars where cents are expected:

type Cents = number & { readonly __brand: 'Cents' }

function toCents(dollars: number): Cents {
  return Math.round(dollars * 100) as Cents
}

function addCents(a: Cents, b: Cents): Cents {
  return (a + b) as Cents
}

// TypeScript error: Argument of type 'number' is not assignable to parameter of type 'Cents'
addCents(1200, 800)          // error — raw numbers aren't Cents
addCents(toCents(12), toCents(8))  // ok

This is more ceremony than most codebases need, but it surfaces the entire class of “wrong unit” bugs at compile time.


7. The Pipeline — Putting It All Together (~10 min)

Given raw API data, produce a specific output shape. The correct approach: filter → enrich → group → aggregate, each step independent and testable.

type RawTransaction = {
  id: string
  card_id: string
  merchant_name: string
  amount_cents: number
  category: string
  created_at: string  // ISO 8601
  status: 'posted' | 'pending' | 'declined'
}

type CardInfo = {
  id: string
  holder_name: string
  department: string
}

type DepartmentReport = {
  department: string
  totalDollars: number
  transactionCount: number
  topMerchant: string       // merchant with highest single transaction
  cardHolders: string[]     // unique, sorted alphabetically
}

Implementation:

function buildDepartmentReport(
  transactions: RawTransaction[],
  cards: CardInfo[]
): DepartmentReport[] {
  // Step 1: Build lookup map — O(cards)
  const cardMap = keyBy(cards, c => c.id)

  // Step 2: Filter posted only and enrich with card data. Amounts stay in integer cents.
  type Enriched = {
    merchant: string
    amountCents: number
    department: string
    holderName: string
  }

  const enriched: Enriched[] = transactions
    .filter(t => t.status === 'posted')
    .map(t => {
      const card = cardMap[t.card_id]
      if (!card) throw new Error(`Unknown card_id: ${t.card_id}`)
      return {
        merchant: t.merchant_name,
        amountCents: t.amount_cents,
        department: card.department,
        holderName: card.holder_name,
      }
    })

  // Step 3: Group by department
  const byDept = groupBy(enriched, t => t.department)

  // Step 4: Aggregate each group. Sum in cents, convert to dollars once at the end.
  return Object.entries(byDept).map(([department, txns]) => {
    const totalCents  = txns.reduce((sum, t) => sum + t.amountCents, 0)
    const topMerchant = txns.reduce((top, t) => t.amountCents > top.amountCents ? t : top).merchant
    const cardHolders = [...new Set(txns.map(t => t.holderName))].sort((a, b) => a.localeCompare(b))

    return { department, totalDollars: totalCents / 100, transactionCount: txns.length, topMerchant, cardHolders }
  })
  })
}

[...new Set(arr)] for deduplication. Set stores unique values in insertion order. Spreading into an array gives you a deduplicated array in O(n). The naive alternative — .filter((v, i, arr) => arr.indexOf(v) === i) — is O(n²) because indexOf scans from the start for every element.
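
Set compares objects by reference, so the trick above only works for primitives such as strings and numbers. For objects, deduplicate by a key instead. `uniqueBy` below is a hypothetical helper sketched for illustration, and the sample holders are made up:

```typescript
type Holder = { id: string; name: string }

const holders: Holder[] = [
  { id: 'c1', name: 'Alice Chen' },
  { id: 'c2', name: 'Bob Park' },
  { id: 'c1', name: 'Alice Chen' },   // same data as the first entry, different object
]

// Equivalent to [...new Set(holders)]: nothing is removed, all three are distinct objects
const viaSet = Array.from(new Set(holders))   // length 3

// Hypothetical helper: keeps the first item seen for each key, still O(n)
function uniqueBy<T, K>(items: readonly T[], getKey: (item: T) => K): T[] {
  const seen = new Map<K, T>()
  for (const item of items) {
    const key = getKey(item)
    if (!seen.has(key)) seen.set(key, item)
  }
  return Array.from(seen.values())
}

const viaKey = uniqueBy(holders, h => h.id)   // 2 items: c1, c2
```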

Top merchant via reduce. Finding the maximum-amount transaction with reduce is the idiomatic one-pass approach. The reducer compares each transaction against the running maximum and returns whichever is larger. At the end, .merchant on the result gives the name.
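
One edge to keep in mind: reduce with no initial value throws a TypeError on an empty array. Inside the pipeline that cannot happen, because a group always has at least one item, but a standalone version should handle it. `maxBy` below is a hypothetical helper, and `Purchase` is a made-up type for the example:

```typescript
type Purchase = { merchant: string; amountCents: number }

// Hypothetical helper: one pass, keeps the first item on ties, and returns
// undefined for an empty array instead of throwing.
function maxBy<T>(items: readonly T[], getValue: (item: T) => number): T | undefined {
  let best: T | undefined = undefined
  let bestValue = -Infinity
  for (const item of items) {
    const value = getValue(item)
    if (best === undefined || value > bestValue) {
      best = item
      bestValue = value
    }
  }
  return best
}

const purchases: Purchase[] = [
  { merchant: 'GitHub', amountCents: 4400   },
  { merchant: 'AWS',    amountCents: 240000 },
  { merchant: 'Vercel', amountCents: 2000   },
]

const topPurchase = maxBy(purchases, p => p.amountCents)?.merchant   // 'AWS'
const noPurchase = maxBy([] as Purchase[], p => p.amountCents)       // undefined
```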


8. Bonus Pipeline — Monthly Merchant Breakdown (~10 min)

Nested grouping: group by merchant, then by month within each merchant.

type Payment = {
  id: string
  merchant: string
  amount_cents: number
  employee_id: string
  date: string      // "YYYY-MM-DD"
  approved: boolean
}

type EmployeeInfo = { id: string; name: string; team: string }

type MerchantReport = {
  merchant: string
  totalDollars: number
  paymentCount: number
  monthlyBreakdown: { month: string; amount: number }[]  // sorted by month
  teams: string[]  // unique, sorted
}

Implementation:

function buildMerchantReport(
  payments: Payment[],
  employees: EmployeeInfo[]
): MerchantReport[] {
  const empMap = keyBy(employees, e => e.id)

  type Enriched = {
    merchant: string
    amountCents: number
    month: string    // "YYYY-MM", taken with slice(0, 7) from "YYYY-MM-DD"
    team: string
  }

  const enriched: Enriched[] = payments
    .filter(p => p.approved)
    .map(p => {
      const emp = empMap[p.employee_id]
      if (!emp) throw new Error(`Unknown employee_id: ${p.employee_id}`)
      return {
        merchant: p.merchant,
        amountCents: p.amount_cents,  // stay in integer cents until the final output
        month: p.date.slice(0, 7),  // "2026-01-15" → "2026-01"
        team: emp.team,
      }
    })

  const byMerchant = groupBy(enriched, p => p.merchant)

  // Sum integer cents, then convert to dollars once
  const sumDollars = (items: Enriched[]) => items.reduce((sum, t) => sum + t.amountCents, 0) / 100

  return Object.entries(byMerchant)
    .map(([merchant, txns]) => {
      // Nested group for monthly breakdown
      const byMonth = groupBy(txns, t => t.month)
      const monthlyBreakdown = Object.entries(byMonth)
        .map(([month, monthTxns]) => ({ month, amount: sumDollars(monthTxns) }))
        .sort((a, b) => a.month.localeCompare(b.month))  // lexicographic = chronological for ISO dates

      return {
        merchant,
        totalDollars: sumDollars(txns),
        paymentCount: txns.length,
        monthlyBreakdown,
        teams: [...new Set(txns.map(t => t.team))].sort((a, b) => a.localeCompare(b)),
      }
    })
    .sort((a, b) => b.totalDollars - a.totalDollars)  // highest spend first
}

ISO date strings sort lexicographically. "2026-01".localeCompare("2026-02") gives the correct chronological order because the string format is designed for this: year first, then month, zero-padded. This only works for ISO 8601 — locale-formatted dates like “Jan 2026” do not sort this way.
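
A quick sketch of the difference. It uses a plain code-unit comparator (`byCodeUnit`, a name made up for this example) rather than localeCompare so the result does not depend on the runtime's locale data; for fixed-width, zero-padded keys the two should agree.

```typescript
// Plain code-unit comparison; enough for fixed-width, zero-padded keys
const byCodeUnit = (a: string, b: string): number => (a < b ? -1 : a > b ? 1 : 0)

const isoMonths = ['2026-02', '2025-12', '2026-01'].sort(byCodeUnit)
// ['2025-12', '2026-01', '2026-02'] (chronological)

const displayMonths = ['Feb 2026', 'Dec 2025', 'Jan 2026'].sort(byCodeUnit)
// ['Dec 2025', 'Feb 2026', 'Jan 2026'] (alphabetical, not chronological)
```

The practical rule: sort on the ISO key and format for display afterwards, never the other way around.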


Cheat Sheet — Methods, Complexity, and When to Use Them

| Method | Returns | Complexity | Use When |
| --- | --- | --- | --- |
| .filter(fn) | New array (subset) | O(n) | Remove items that don’t match |
| .map(fn) | New array (same length) | O(n) | Transform every item |
| .reduce(fn, init) | Anything | O(n) | Accumulate into a single value |
| .flatMap(fn) | New flattened array | O(n) | Map then flatten one level |
| [...arr].sort(fn) | New sorted array (original untouched) | O(n log n) | Order items; .sort() mutates in place, so always spread first |
| Object.fromEntries() | Object | O(n) | [key, value] pairs → object |
| Object.entries() | Array of [k, v] | O(n) | Object → iterable pairs |
| Object.values() | Array of values | O(n) | Object → values only |
| Object.groupBy() | Object of arrays | O(n) | Group by key (ES2024) |
| new Map(entries) | Map | O(n) | Non-string keys, or need .size/.has() |
| new Set(arr) | Set | O(n) | Deduplicate; fast .has() checks |
| [...new Set(arr)] | New array | O(n) | Deduplicated array |
| arr.find(fn) | Item or undefined | O(n) | First match; build a map for repeated lookups |
| str.slice(0, 7) | Substring | O(1) | Extract “YYYY-MM” from ISO date |

Complexity note. The difference between O(n) and O(n²) is invisible at n=20 and catastrophic at n=10,000. The rule: if you’re calling .find() inside a .map(), replace the .find() with a lookup map.