HR Onboarding Tool

AI-powered employee onboarding platform — SG Innovation Challenge 2025

Python · Streamlit · LangChain · FAISS · Groq LLM · OpenAI Whisper

Overview

The HR Onboarding Tool is an AI-powered employee onboarding platform built for the SG Innovation Challenge 2025, where it reached the finalist round. The platform replaces static onboarding documents and one-size-fits-all training programmes with an interactive assistant that answers policy questions, guides new hires through scenario-based exercises, and gives HR teams a live analytics view of onboarding progress.

With this tool, we want new hires to be able to ask "how do I apply for leave?" and get an accurate, localised answer instantly, instead of submitting an HR ticket and waiting for a response, or combing through pages of a PDF handbook.

SG Innovation Challenge 2025, Finalist. Competed against teams from across universities and polytechnics in Singapore.

The Problem

Traditional employee onboarding is slow, passive, and hard to measure. New hires receive a stack of documents and sit through generic presentations, often in their first week or two, covering scenarios they won't encounter until months later.

HR teams, meanwhile, have no easy way to know whether a new hire has actually understood company policy, or to identify which areas are causing the most confusion.


The Solution

The HR Onboarding Tool replaces the static handbook with a conversational interface backed by the company's own documents. New hires get accurate, sourced answers at the moment they actually need the information. HR gets visibility into where people are struggling, and the interactive scenario-based training lets each onboarding journey be personalised.

Demonstration of the dynamic quiz generation, and the AI assistant.


Features

AI Assistant

New hires can ask questions about company policies, or about onboarding training content they are unsure of. Each question is passed to the Retrieval-Augmented Generation (RAG) pipeline, which returns a tailored answer to the user in real time.

Document Search and Retrieval

Company policy documents and handbooks are chunked, embedded, and indexed in a FAISS vector store. LangChain retrieves the most relevant passages for each query and passes them to the LLM as context. This grounds answers in the source documents and reduces the risk of hallucination.
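The chunk, embed, index, and retrieve steps can be illustrated with a minimal, dependency-free sketch. This is not the project's code: `ToyVectorStore`, the word-count "embedding", the chunk sizes, and the sample documents are all assumptions made for illustration, standing in for the FAISS index and the dense embedding model that the real pipeline uses through LangChain.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, chunk_size: int = 12, overlap: int = 4) -> list[str]:
    """Split text into overlapping word windows (sizes are illustrative)."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already covers the tail
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real system uses a dense model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in for the FAISS index: stores (chunk, source, vector)."""

    def __init__(self) -> None:
        self.entries: list[tuple[str, str, Counter]] = []

    def add_document(self, source: str, text: str) -> None:
        for chunk in chunk_text(text):
            self.entries.append((chunk, source, embed(chunk)))

    def search(self, query: str, k: int = 2) -> list[tuple[str, str]]:
        """Return the k most similar chunks together with their source names."""
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[2]), reverse=True)
        return [(chunk, source) for chunk, source, _ in ranked[:k]]


# Hypothetical sample documents, for illustration only.
store = ToyVectorStore()
store.add_document(
    "leave_policy.txt",
    "Employees apply for annual leave through the HR portal "
    "at least three days in advance.",
)
store.add_document(
    "expenses.txt",
    "Submit expense claims with receipts within thirty days of purchase.",
)
results = store.search("How do I apply for leave?", k=1)
```

Keeping the source name next to each chunk is what allows the retrieved passages to be cited back to the user later in the pipeline.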

Scenario-Based Training

Interactive workplace scenarios test whether new hires can apply company policy in realistic situations, for example how to deal with a difficult customer. Each scenario is generated in real time by Llama 3 and converted to speech by Kokoro TTS. When the user responds by voice, OpenAI Whisper transcribes the audio and passes the text back to the LLM. Finally, the LLM evaluates the free-text response, gives constructive feedback, and logs the score in SQLite for HR review.
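The final logging step might look like the sketch below, which uses Python's built-in `sqlite3` module. The table name `scenario_scores`, its columns, and the 0 to 100 score range are assumptions for illustration; the project's actual schema is not described here.

```python
import sqlite3
from datetime import datetime, timezone


def init_db(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) table that holds evaluated scenario attempts."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS scenario_scores (
            id          INTEGER PRIMARY KEY AUTOINCREMENT,
            employee_id TEXT NOT NULL,
            scenario    TEXT NOT NULL,
            score       REAL NOT NULL CHECK (score BETWEEN 0 AND 100),
            feedback    TEXT,
            created_at  TEXT NOT NULL
        )
        """
    )


def log_score(conn: sqlite3.Connection, employee_id: str, scenario: str,
              score: float, feedback: str) -> int:
    """Insert one evaluated attempt and return its row id."""
    with conn:  # commits on success, rolls back if the insert fails
        cur = conn.execute(
            "INSERT INTO scenario_scores "
            "(employee_id, scenario, score, feedback, created_at) "
            "VALUES (?, ?, ?, ?, ?)",
            (employee_id, scenario, score, feedback,
             datetime.now(timezone.utc).isoformat()),
        )
    return cur.lastrowid


# Illustrative usage with an in-memory database and made-up values.
conn = sqlite3.connect(":memory:")
init_db(conn)
row_id = log_score(conn, "E001", "difficult_customer", 78.0,
                   "Good empathy; mention the escalation policy next time.")
```

Parameterised queries (`?` placeholders) keep LLM-generated feedback text from being interpreted as SQL, and the `CHECK` constraint rejects scores outside the assumed range.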

HR Analytics Dashboard

HR managers see an aggregated view: which modules have been completed, average scenario scores by department, and which policy areas generate the most questions. The dashboard is built in Streamlit with Plotly charts.
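Two of these aggregates can be sketched as plain SQL over SQLite. The tables (`employees`, `scenario_scores`, `questions`), their columns, and the sample rows are hypothetical and exist only to make the example runnable; the Streamlit and Plotly rendering is left out.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with made-up rows for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE employees (employee_id TEXT PRIMARY KEY, department TEXT);
        CREATE TABLE scenario_scores (employee_id TEXT, scenario TEXT, score REAL);
        CREATE TABLE questions (employee_id TEXT, policy_area TEXT);
        """
    )
    conn.executemany("INSERT INTO employees VALUES (?, ?)",
                     [("E1", "Engineering"), ("E2", "Engineering"), ("E3", "Sales")])
    conn.executemany("INSERT INTO scenario_scores VALUES (?, ?, ?)",
                     [("E1", "difficult_customer", 80.0),
                      ("E2", "difficult_customer", 90.0),
                      ("E3", "difficult_customer", 70.0)])
    conn.executemany("INSERT INTO questions VALUES (?, ?)",
                     [("E1", "leave"), ("E3", "leave"), ("E2", "expenses")])
    return conn


def average_scores_by_department(conn: sqlite3.Connection) -> list[dict]:
    """Average scenario score and attempt count per department."""
    rows = conn.execute(
        """
        SELECT e.department, AVG(s.score), COUNT(*)
        FROM scenario_scores AS s
        JOIN employees AS e ON e.employee_id = s.employee_id
        GROUP BY e.department
        ORDER BY e.department
        """
    ).fetchall()
    return [{"department": d, "avg_score": round(a, 1), "attempts": n}
            for d, a, n in rows]


def question_counts_by_policy_area(conn: sqlite3.Connection) -> list[tuple[str, int]]:
    """Policy areas ordered by how many questions they generated."""
    return conn.execute(
        """
        SELECT policy_area, COUNT(*) AS n
        FROM questions
        GROUP BY policy_area
        ORDER BY n DESC, policy_area
        """
    ).fetchall()


conn = build_demo_db()
dept_scores = average_scores_by_department(conn)
hot_topics = question_counts_by_policy_area(conn)
```

Each function returns plain Python structures, so the result could be handed to a charting layer without further reshaping.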


Architecture

| Layer | Technology | Responsibility |
|---|---|---|
| Frontend | Streamlit | Chat UI, voice recorder, dashboard visualisations |
| RAG pipeline | LangChain + FAISS | Document chunking, embedding, retrieval, and prompt assembly |
| LLM | Llama 3 | Answer generation, scenario evaluation, feedback |
| Speech | OpenAI Whisper | Audio transcription for voice queries |
| Storage | SQLite | User progress, scenario scores, session history |

RAG query flow:

  1. User types or speaks a question
  2. Whisper transcribes audio (voice path) → text query
  3. Query embedded and searched against FAISS index
  4. Top-k chunks retrieved → assembled into LLM prompt with source metadata
  5. Llama 3 generates a grounded answer
  6. Response displayed in chat and interaction is logged to SQLite
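
The six steps above can be sketched as one orchestration function. This is an illustration under stated assumptions, not the project's implementation: `Chunk`, `build_prompt`, and `answer_query` are hypothetical names, and the `transcribe`, `retrieve`, `generate`, and `log` callables are placeholders for Whisper, the FAISS retriever, Llama 3, and the SQLite logger respectively.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Chunk:
    text: str
    source: str  # source metadata carried into the prompt


def build_prompt(question: str, chunks: list[Chunk]) -> str:
    """Step 4: assemble retrieved chunks and their sources into an LLM prompt."""
    context = "\n\n".join(
        f"[{i}] (source: {c.source})\n{c.text}"
        for i, c in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "Cite sources by their bracketed number.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


def answer_query(
    text: Optional[str],
    audio: Optional[bytes],
    transcribe: Callable[[bytes], str],
    retrieve: Callable[[str, int], list[Chunk]],
    generate: Callable[[str], str],
    log: Callable[[str, str], None],
    k: int = 3,
) -> str:
    """Run steps 1 to 6 with the model-backed parts injected as callables."""
    if text is None:
        if audio is None:
            raise ValueError("Provide either a typed question or audio.")
        text = transcribe(audio)       # step 2: voice path only
    chunks = retrieve(text, k)         # step 3: vector search for top-k chunks
    prompt = build_prompt(text, chunks)  # step 4
    answer = generate(prompt)          # step 5
    log(text, answer)                  # step 6: persist the interaction
    return answer
```

Passing the model-backed steps in as callables keeps the flow testable with simple stubs, and lets the typed and spoken paths share everything after transcription.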