About

UTS Tech Festival (22 June – 26 June, 2026) is a week-long festival of events, showcases, hackathons, workshops, masterclasses, seminars, competitions, industry-student engagement and much more! Our goal is to bring together students, academics and industry to foster learning and inspiration, share ideas and promote innovation.
Come along to the UTS AI Showcase, which is part of the UTS Tech Festival 2026, to discover Artificial Intelligence, Deep Learning, Reinforcement Learning, Computer Vision, NLP, and Vision-Language fusion projects!
This is the fifth iteration of the UTS AI Showcase, which presents Artificial Intelligence (AI) and Deep Learning projects developed by undergraduate, postgraduate, and higher degree research (HDR) students from the Faculty of Engineering and IT (FEIT).

Venue:

UTS Faculty of Engineering and Information Technology

UTS Building 11 (CB11), Level 4
Broadway, Ultimo, NSW 2007, Australia
 
 

Date and Time:

Wednesday, June 24, 2026, 3.30pm – 7.30pm

Photo Gallery from 2025

Showcase Program

June 24, 3.00pm, Room CB11.04.401: Participating teams arrive and set up
June 24, 4.00pm, Room CB11.04.401: AI Showcase Opening Ceremony
3.55pm, Rooms CB11.04.205, CB11.04.400, CB11.04.300, CB11.04.301: AI Project Showcase & Judging Begins
4.30pm, Room CB11.04.205: Demos
5.30pm, All Rooms: Food and Drinks
6.55pm: Judging and Voting Concludes
7.00pm, Room CB11.04.401: Presentation by Sponsor
7.20pm, Room CB11.04.401: Showcase Winners Announced
7.30pm, Room CB11.04.401: Closing Ceremony
7.30pm: AI Showcase Ends

Drone Lab Demo

Moon Explorer

This project develops an advanced autonomous Moon Explorer utilizing a rugged Unmanned Ground Vehicle (UGV) platform equipped with a high-resolution camera array to safely navigate and map the lunar surface. The integrated system leverages state-of-the-art computer vision models and real-time 3D reconstruction techniques to generate a highly accurate digital model of the rugged terrain for remote exploration, hazard avoidance, and detailed topological analysis. Additionally, it detects and dynamically identifies lunar craters in real time, enabling significantly safer surface navigation and directly supporting future scientific exploratory missions.

Room : CB11.04.205
Pod : 2

Teachable Drones

This desktop application enables hands-free piloting of a DJI Tello drone using hand gestures recognized through real-time computer vision. Leveraging MediaPipe to track 21 distinct hand landmarks via a standard webcam feed, the system extracts a normalized pose vector to feed an on-device machine learning classifier that maps physical movements to flight commands such as takeoff, landing, and directional navigation. Operating entirely offline through a local Flask server to ensure data privacy, the application features an intuitive Electron UI that guides operators through a seamless three-step workflow: custom gesture recording, localized edge training, and live flight execution.

Room : CB11.04.205
Pod : 1
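
The landmark-to-command step described for Teachable Drones can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the team's implementation: it takes 21 (x, y) landmarks with the wrist at index 0 (the MediaPipe Hands convention), normalizes them by shifting to the wrist and scaling by the largest wrist distance, and uses a nearest-centroid classifier as a stand-in for the unspecified on-device model. The function names and gesture labels are hypothetical.

```python
import math

def normalize_pose(landmarks):
    """Turn 21 (x, y) hand landmarks into a translation- and scale-invariant vector."""
    if len(landmarks) != 21:
        raise ValueError("expected 21 landmarks")
    wx, wy = landmarks[0]  # wrist is landmark 0 in MediaPipe Hands
    shifted = [(x - wx, y - wy) for x, y in landmarks]
    # Scale by the farthest landmark from the wrist; fall back to 1.0 for a degenerate hand.
    scale = max(math.hypot(x, y) for x, y in shifted) or 1.0
    return [c / scale for point in shifted for c in point]

def train_centroids(samples):
    """samples: {label: [pose_vector, ...]} -> {label: mean pose vector}."""
    centroids = {}
    for label, vectors in samples.items():
        n = len(vectors)
        centroids[label] = [sum(col) / n for col in zip(*vectors)]
    return centroids

def classify(pose, centroids):
    """Return the label whose centroid is closest to the pose vector."""
    return min(centroids, key=lambda label: math.dist(pose, centroids[label]))
```

Because the vector is normalized, a gesture recorded close to the camera should still match the same gesture performed further away or elsewhere in the frame.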

Participating Teams

Computer Vision Projects

SolarVision

This automated clean energy infrastructure inspection system uses real-time computer vision to detect solar farm defects from uploaded imagery, recorded videos, or a live drone feed. Designed for aerial monitoring, the pipeline isolates damaged or dirty cell regions, scores overall panel health, and estimates potential energy losses. It integrates these diagnostics into a custom service dashboard that automatically generates maintenance reports, schedules, and active work orders. By converting raw inspection footage into targeted operational insights, it allows operators to protect renewable energy output.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.205
Pod : 1

WasteArm

WasteArm is an automated community recycling system that bridges the gap between machine vision and physical robotics to eliminate manual sorting hazards. Equipped with a camera array and a smart transformer-based AI model, the system automatically recognizes everyday recyclables like aluminum cans and cardboard cartons. The vision pipeline interfaces directly with a physical robotic arm, translating visual classifications into smooth, precise pick-and-place movements. The final execution delivers a safer, highly accessible waste management solution designed for local sorting centers.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.400
Pod : 7

KitchenEye

KitchenEye is a hands-free kitchen assistant that combines real-time object guidance with proactive safety monitoring to increase cooking independence for visually impaired users. Streaming video from Meta Smart Glasses, the system identifies culinary items and maps nearby environmental layouts. The vision pipeline reacts to simple voice commands to deliver spoken navigation directions and issue active audio alerts when hazards like knives or hot cookware are nearby. It builds user confidence and kitchen autonomy by providing continuous safety tracking during everyday cooking activities.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.400
Pod : 11

PatchGuard

This geospatial road maintenance platform uses automated computer vision to detect, map, and catalog structural highway damage. By evaluating street-level photography and casual vehicle dashcam footage, the system identifies pothole formations and cracks without requiring specialized inspection vehicles. The backend instantly plots each localized damage marker onto an interactive map dashboard complete with annotated image verifications. It streamlines municipal engineering workflows, allowing road managers to rapidly review automated findings, plan targeted repairs, and maintain safer infrastructure.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.400
Pod : 13

DiceCalc

This web utility automates numerical calculation workflows for complex multi-dice rolls during immersive tabletop roleplaying campaigns like Dungeons & Dragons. Embedding a YOLO26 object detection model into its processing pipeline, the application dynamically scans polyhedral dice arrays to speed up game mechanics by computing roll totals instantly for players. The flexible system architecture eliminates the pacing delays of manual calculations by allowing users to easily customize the baseline verification ruleset, seamlessly adapting the automated setup to the varied tracking guidelines of diverse tabletop gaming networks.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.400
Pod : 9

Axon

This real-time wearable navigation aid deploys edge computer vision to actively assist visually impaired pedestrians. Utilizing a video stream pulled directly from Meta Ray-Ban smart glasses, the system runs an offline, custom YOLOv12-Small model to detect navigation hazards at 91 FPS. The architecture couples a priority-based alert framework with a local voice assistant powered by GPT-4o-mini to resolve natural-language spatial queries. Operating entirely offline with a sub-50 ms alert latency, it provides robust, hands-free orientation without relying on vulnerable, external cloud connectivity networks.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.400
Pod : 8

DermaScanner

This diagnostic web application evaluates superficial facial skin issues using a confidence-aware hybrid deep learning structure. By coordinating YOLOv8n object detection with a backup ResNet18 classifier, it intelligently routes imagery to ensure precise real-time condition identification. The dual-model inference interface optimally balances edge resource demands with diagnostic classification accuracy to handle wide environmental variations in lighting, background noise, and superficial skin conditions.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.400
Pod : 4
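
One way to picture DermaScanner's confidence-aware routing between a detector and a backup classifier is the sketch below. The routing rule and the 0.5 threshold are assumptions made for illustration; the project description does not state how images are routed. The two models are represented by plain callables rather than real YOLOv8n or ResNet18 instances, and all names are hypothetical.

```python
from typing import Callable, Optional, Tuple

Prediction = Tuple[str, float]  # (label, confidence in [0, 1])

def route_prediction(
    image,
    detector: Callable[[object], Optional[Prediction]],
    classifier: Callable[[object], Prediction],
    min_confidence: float = 0.5,  # hypothetical threshold, not from the project
) -> Tuple[str, float, str]:
    """Return (label, confidence, source).

    The detector's answer is used when it is confident enough; otherwise the
    image falls back to the whole-image classifier.
    """
    detection = detector(image)
    if detection is not None and detection[1] >= min_confidence:
        return detection[0], detection[1], "detector"
    label, confidence = classifier(image)
    return label, confidence, "classifier"
```

Returning the source alongside the label lets a front end show which model produced the result.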

SpatialSync

This wearable assistive device enhances spatial understanding and object localization for visually impaired users. Running a lightweight YOLO26-small object detection model on mobile hardware, the system maps unknown or unfamiliar interior environments. The localization script integrates directly with the built-in speaker arrays of Meta Ray-Ban smart glasses. It acts as a real-time personal navigation assistant, converting visual spatial coordinates into directional audio signals to guide the wearer safely.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.400
Pod : 3

LiPi Sort

This scalable mail automation program deciphers handwritten postal codes across varying lighting conditions and skew angles with a 98% accuracy score. Built around YOLO11s accompanied by automated rotation and contrast-correction pipelines, the framework integrates with physical robotic machinery to sort parcels into bins efficiently. The architecture removes some expensive infrastructure barriers, making industrial-grade sorting accessible to smaller postal operations.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.400
Pod : 1

VisionScape

This system automatically converts 3D indoor scene data into clean, structured 2D floorplan layouts, addressing the traditional complexities and manual drafting demands of indoor scanning. Processing raw architectural metrics, the system evaluates room geometry, wall configurations, openings, and visual scene parameters to predict structured room shapes and generate comprehensive layouts. It serves as an advanced utility that accelerates documentation, renovation planning, and detailed spatial analysis.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.400
Pod : 14

VTuber Studio

This accessible digital animation system maps human motion data onto virtual avatars using consumer-grade equipment. Capturing standard webcam feeds, the system deploys deep learning models to track the user’s face, body orientation, and fine hand gestures. The motion tracking pipeline translates telemetry parameters directly onto a 3D VRM avatar skeleton rendered inside a web browser. It provides a low-cost, browser-based runtime that enables interactive virtual performance tracking without specialized capture suits.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.400
Pod : 5

FatigueSense

FatigueSense is a computer vision tracker that continuously measures human exhaustion levels in real time using only a standard desktop webcam. By translating physical behavior markers into a live productivity focus metric updated every second, the system helps users optimize daily rest schedules. The system analyzes facial fatigue markers and physical drowsiness indicators, empowering students and remote desk workers to actively protect their long-term health, wellness, and mental focus.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.300
Pod : 4

ImConVo

This visual translation utility processes silent video footage to translate human lip movements directly into clear text strings. It features face-tracking components to isolate the mouth area and integrates an LLM to clean up translation mistakes. The resulting system serves as an accessibility layer for individuals with auditory impairments and for people working in very noisy corporate environments.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.400
Pod : 10

Court IQ

This open-source basketball analytics program tracks player coordinates and catalogs shooting performance charts from standard video sources. Running locally on basic consumer hardware, it plots active scoring grids onto a 2D court representation to guide grassroots coaches. The processing layout extracts tactical values from baseline recordings to unlock data-driven coaching methods for amateur sports divisions.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.400
Pod : 12

F1 Vision

This commercial analytics engine measures sponsor visibility during fast-paced Formula 1 race broadcasts. The underlying computer vision architecture monitors logo screen space and appearance frequencies to supply advertisers with automated return-on-investment data. By eliminating manual videography audits, it grants marketing agencies deep statistical telemetry on television placement value.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.400
Pod : 6
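
The two quantities named in the F1 Vision description, logo screen space and appearance frequency, could be aggregated from per-frame detections along these lines. This is a rough sketch that assumes a detector has already produced pixel bounding boxes per frame; the input layout, function name, and output keys are hypothetical and not taken from the project.

```python
from collections import defaultdict

def sponsor_visibility(frames, frame_width, frame_height):
    """Aggregate logo detections into per-sponsor visibility statistics.

    frames: list of per-frame detections, each a list of
    (sponsor, x1, y1, x2, y2) pixel boxes.
    """
    frame_area = frame_width * frame_height
    frames_seen = defaultdict(int)
    share_sum = defaultdict(float)
    for detections in frames:
        per_frame = defaultdict(float)
        for sponsor, x1, y1, x2, y2 in detections:
            per_frame[sponsor] += max(0, x2 - x1) * max(0, y2 - y1)
        for sponsor, area in per_frame.items():
            frames_seen[sponsor] += 1
            # Overlapping boxes are not merged, so cap the share at the full frame.
            share_sum[sponsor] += min(area / frame_area, 1.0)
    total = len(frames)
    return {
        sponsor: {
            "frames_seen": frames_seen[sponsor],
            "appearance_rate": frames_seen[sponsor] / total,
            "mean_screen_share": share_sum[sponsor] / frames_seen[sponsor],
        }
        for sponsor in frames_seen
    }
```

Dividing `frames_seen` by the broadcast frame rate would convert the count into seconds of on-screen time.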

FlowBust

This intelligent infrastructure screening system automatically identifies structural anomalies within municipal sewer pipelines. Scanning CCTV pipe imagery, the system deploys a fine-tuned YOLO model to pinpoint obstacles, ruptures, and material deformations. The development pipeline incorporates SAM-generated pixel-level masks during dataset preparation. The system also features a purpose-built interface where inspectors upload imagery, review annotated outputs, and export prioritized summaries.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.300
Pod : 1

GrabnGo

GrabnGo is a smart retail checkout demo that replaces queues and frustrating self-scanning with real-time product tracking. Cart cameras use YOLO26s for product recognition and checkout verification, while shelf cameras combine basket/item detection, engineered interaction features, and a classifier to detect suspicious behaviour. It helps reduce shrinkage losses while giving managers live sessions, cart values, checkout progress, security alerts, and clearer operational visibility in stores.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.300
Pod : 2

Sketch2Stage

This game design utility automatically converts photos of hand-drawn paper sketches into structured, digital game environments. Running on a custom YOLO architecture, the system parses structural elements like drawn platforms, symbols, and interactive zones. The vision pipeline bridges the gap between physical ideation and functional software assets, enabling creators to rapidly iterate on level concepts. The platform lowers the technical barrier for non-programming designers, drastically reducing initial prototyping timelines.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.400
Pod : 15

PitchSense

PitchSense is an athletic performance tracking system that extracts actionable football telemetry and tactical movement trends from standard match footage. Leveraging machine learning and computer vision frameworks, the automated system captures player configurations and ball trajectories without specialized hardware. The tracking pipeline delivers immediately scannable performance metrics and positional maps through an intuitive interface, enabling coaches to make swift, data-backed strategic decisions.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.301
Pod : 2

FareGuard

This transit security system utilizes deep learning to detect public transportation fare evasion in real time. The architecture deploys a custom-trained YOLO model to dynamically map gate Regions of Interest (ROI) while utilizing YOLOv8-pose to track commuter trajectories. The computer vision backend automatically cross-references tracking data with hardware validation pings from physical QR code ticket scans. By flagging complex tailgating edge cases as "Uncertain," it seamlessly loops in human checkers for manual review.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.300
Pod : 5
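
The cross-referencing of tracked gate crossings against ticket scans in FareGuard might look something like the sketch below. It is an illustration under stated assumptions rather than the project's logic: each scan validates at most one crossing within `max_gap` seconds, an unmatched crossing shortly after a validated one at the same gate is treated as possible tailgating, and the time windows and the "Valid" and "Unpaid" labels are hypothetical. Only the "Uncertain" label comes from the description.

```python
def review_crossings(crossings, scans, max_gap=5.0, tailgate_gap=2.0):
    """Label each gate crossing by matching it against ticket scans.

    crossings: list of (track_id, gate_id, time_s)
    scans: list of (gate_id, time_s)
    Returns {track_id: status}.
    """
    unused = {}  # gate_id -> scan times not yet matched, oldest first
    for gate, t in sorted(scans, key=lambda s: s[1]):
        unused.setdefault(gate, []).append(t)
    last_valid = {}  # gate_id -> time of the most recent validated crossing
    result = {}
    for track_id, gate, t in sorted(crossings, key=lambda c: c[2]):
        pending = unused.get(gate, [])
        match = next((s for s in pending if 0 <= t - s <= max_gap), None)
        if match is not None:
            pending.remove(match)  # one scan pays for one crossing
            last_valid[gate] = t
            result[track_id] = "Valid"
        elif gate in last_valid and t - last_valid[gate] <= tailgate_gap:
            result[track_id] = "Uncertain"  # possible tailgating: send to a human checker
        else:
            result[track_id] = "Unpaid"
    return result
```

Keeping ambiguous cases in a separate bucket means the automated decision is only final when the evidence is clear in either direction.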

VitalLens

This contactless biometric monitoring application measures core human physiological metrics through a browser interface. Using a standard laptop or smartphone webcam, the system employs remote photoplethysmography to track micro-color variations across the user's skin surface. The vision pipeline extracts and delivers real-time vital sign metrics without requiring extra wearables or physical hardware sensors. By bringing hardware tracking costs to zero, the web app places proactive health tracking tools in the hands of anyone with a phone.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.400
Pod : 2

Invigilo

This continuous proctoring application monitors live examination spaces through existing security cameras to uphold academic integrity. The tracking software monitors student posture adjustments, highlights cheating anomalies in real time, and saves a comprehensive activity log. By giving supervisors a digital eye that never blinks, the system ensures consistent room monitoring while keeping testing environments equitable and fair.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.301
Pod : 4

CrowdVision

This venue management web platform computes pedestrian volumes and density thresholds directly from live architectural video feeds. The framework generates immediate occupancy heatmaps and capacity warnings to help operations staff make rapid crowd control decisions. By pairing adaptive model tracking with venue size inputs, the system ensures swift logistical coordination across transportation hubs, stadiums, and crowded events.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.301
Pod : 5

ArchVision

ArchVision is an educational resource which uses a ResNet-50 ensemble architecture to identify architectural styles from a single building photograph. This system outputs architectural classifications across 45 styles, providing confidence ratings and context notes for users. The system serves as an interactive guide for real estate professionals, urban planners, and tourists seeking immediate access to regional design histories.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.301
Pod : 6

ImpactVision

This crisis mapping software processes satellite imagery to evaluate geographic infrastructure destruction within minutes of an environmental disaster. It segments buildings into intact, damaged, or destroyed states to provide emergency teams with immediate situational awareness. The simple web visualization layer enables disaster response personnel and humanitarian groups to target relief resources efficiently without technical GIS expertise.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.301
Pod : 7

MediVision

MediVision is a medical diagnostic hub that screens eye, skin, and dental pathologies using an end-to-end double-stage ResNet50 vision pipeline. It provides clinical triage recommendations and outputs explicit Grad-CAM activation maps to deliver transparent diagnostic explainability. The platform handles 24 distinct classification categories within an accessible web framework to streamline early-stage clinical workflows.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.301
Pod : 9

DeepLight

This image enhancement utility increases the operational reliability of automated vehicles operating in severe weather or low-light conditions. The underlying single neural network clears away visual degradation caused by heavy rain, fog, and midnight shadows to rescue object detection rates. The framework updates baseline security cameras and autonomous sensor streams, creating robust visual navigation safeguards across all weather conditions.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.301
Pod : 8

SoccerLab

This athletic analysis package delivers automated tactical metrics to mid- and lower-tier sports organizations without requiring premium licensing fees. Running cost-effective YOLO models on standard match footage, it breaks down team formations and performance metrics. The lightweight pipeline democratizes professional-grade sports metrics, eliminating the need for expensive stadium tracking systems.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.401
Pod : 4

Natural Language / Audio Processing Projects

MailDefender

This security interface equips enterprise networks against social engineering vectors by implementing an explainable phishing detection workflow. The engine screens incoming emails for urgency indicators or domain anomalies and presents transparent risk justifications to administrators. The unified dashboard provides structural tools to safely quarantine, review, and release suspicious corporate correspondence, making corporate communication audits transparent and accountable.

Subject: 41079 Computing Science Studio 2
Room : CB11.04.301
Pod : 1

MedTriage

This conversational health assistant processes user age, gender, symptom duration, and severity markers to safely manage institutional clinical workloads. The system evaluates patient data to output a structured 1-to-7 triage urgency index ranging from self-care to emergency dispatch. The automated algorithmic architecture mitigates hospital overcrowding by efficiently steering users toward optimal local clinical tiers in real time.

Subject: 41079 Computing Science Studio 2
Room : CB11.04.401
Pod : 3

FairSpeech

This voice recognition toolkit targets and corrects accent-based transcription bias within modern automated speech communication software. By fine-tuning Whisper language architectures across distinct cultural groups, the software narrows transcription error imbalances to promote equity. The workflow optimizes both real-time transcription and automated translation layers for culturally diverse regions across Australia, ensuring reliable, inclusive digital access for historically underrepresented linguistic communities.

Subject: 41079 Computing Science Studio 2
Room : CB11.04.401
Pod : 5

Filterpass

This real-time synthetic audio detection app prevents phone-scam voice fraud during live conversations. Built on an optimized, ultra-lightweight audio algorithm, the software runs directly on standard smartphones to spot deepfake speech signatures through phone line static. The detection engine is engineered to be three times smaller than existing models so that it can process incoming audio streams instantly without call lag. The system provides accessible edge protection, empowering everyday users to safely verify who is actually on the line.

Subject: 41079 Computing Science Studio 2
Room : CB11.04.401
Pod : 3

Multi-Modal Fusion Projects

UniForm

This document ingestion platform accelerates business administrative pipelines by automatically parsing tabular information across 8,000 layouts. Powered by LayoutLMv3, it guarantees a 90% recognition rate while preserving data privacy by running entirely on-premise. The plug-and-play architecture allows enterprise teams to instantly extract structured information from complex forms without requiring individual template retraining.

Subject: 42028 Deep Learning and CNN
Room : CB11.04.301
Pod : 3

SafeSense

SafeSense is a passive safety system which updates traditional manual panic buttons by executing automated multi-modal sensor classification on wearable devices. The machine learning architecture evaluates continuous user telemetry to automatically distinguish between five unique distress states. By identifying categories like physical injury or emotional trauma silently, the system coordinates immediate help without active user interaction.

Subject: 41030 Engineering Capstone
Room : CB11.04.300
Pod : 2

Meet Our Judges

Roman Koerner

DroneShield, SFAI Team Lead

Dr. Ramya Rajendran

Data Specialist, Allianz

Gitarth Vaishnav

DroneShield

A/Prof. Hai Yan Lu

University of Technology Sydney

Meet Our Industry Speakers

Organizing Committee

A/Prof. Nabin Sharma

Organizing Chair and Lead

Debrit Bhattacharya

Casual Academic

Radhika Verma

Casual Academic

Jason Do

Coordinator, Engaged Learning

Sarah Rodriguez

Student Engagement Officer

Supported By

DroneShield