Zhijun (Nicholas) Huang

Bachelor's in Financial Engineering @ CUHK

Dual M.S. in Information Systems @ Cornell

About Me

Zhijun Huang is a Master's student at Cornell University (Cornell Tech), specializing in Information Systems and Information Science. He is passionate about the applications of AI and Machine Learning across diverse industries, with hands-on experience in AI for Software Engineering and AI for Finance. His research interests encompass AIOps, FinTech, and a broad spectrum of AI applications, including Deep Learning, Natural Language Processing, and Computer Vision. Recently, he has been actively studying Large Language Models and developing applications with them.

Personal Projects

The projects showcased below are mostly self-initiated and inspired by my personal interests. They represent my ongoing efforts to learn, experiment, and apply technology to areas that intrigue me, while continually improving my skills and understanding.

SentenSnap (Original Version)

An interactive web application that combines natural language processing and AI-powered image generation to provide a unique and educational experience. Users can explore random quotes, learn word definitions, and generate artistic images based on sample sentences.

Repository: GitHub Repo | Demo: Video Link

SentenSnap (Deployed Version)

This application is a modified version of the SentenSnap repository, originally built to operate with locally installed large language models (LLMs). The current version leverages Streamlit for its user interface and integrates the Gemini API to deliver a fully deployed, cloud-based experience. While the image generation functionality from the original version has been omitted, this modification focuses on providing seamless access to motivational quotes and interactive word snapshots without requiring local LLM installations. In addition to quotes, this version also offers generated knowledge capsules and book excerpts for learners to read and expand their vocabulary.

Repository: GitHub Repo | Deployed App: APP

Trulingo

Trulingo is a multilingual news aggregator and fact-checking tool that retrieves and processes articles from various news sources. It supports both English and Chinese news sources and performs LLM analysis and comparison using the Gemini API.

Repository: GitHub Repo | Deployed App: APP

MyAndroidLLM

MyAndroidLLM is an enhanced version of the EdgeLLM project, which was originally designed to run large language models (LLMs) on edge devices using React Native. The original project, detailed in the Hugging Face blog post, utilizes llama.rn to load GGUF files efficiently. This revised version builds upon the foundation provided by the EdgeLLM repository and introduces several key improvements.

Repository: GitHub Repo

Tank of Tanks

A 4-player local multiplayer tank battle game built with Python and Pygame. Players control mini tanks, spin to aim, and fire bullets to eliminate opponents. The game features dynamic gameplay with tank collisions, spinning mechanics, health bars, and an exciting victory screen. The last tank standing wins! This project is inspired by the original Tank of Tanks Battle game by Orion Game.

Repository: GitHub Repo | Deployed App: APP

Modified Snake Game

This is an exciting variation of the classic Snake game, implemented in Python using the Turtle graphics library. In this version, you control a snake that must eat numbered food items while avoiding a pursuing monster.

Repository: GitHub Repo

Number Guessing Game

A simple yet engaging Number Guessing Game with a graphical user interface built using Python and Tkinter.

Repository: GitHub Repo

Patent & Publication

Industry Experience

  • BILL Operations, LLC
    Draper, UT, USA | May 2024 - Aug 2024
    • Extracted and analyzed billions of transactions using efficient SQL queries and visualizations such as word clouds.
    • Built pipelines for traditional ML models (e.g., XGBoost, Autoencoder) for transaction categorization; achieved >90% accuracy.
    • Designed and implemented an LLM-based approach to transaction categorization; maintained >90% accuracy and contributed a ~32% lift in coverage.
  • Huawei Technologies Co., Ltd
    Shenzhen, China | Jun 2022 - Jun 2023
    • Developed, trained, and tested hypergraph-based clustering and spectrum-ranking ML models on over 70GB of industrial log data.
    • Built an online, real-time stream-clustering pipeline to address latency issues and scale the model in production.
    • Co-inventor of a patent: “A Root Cause Analysis System for Software Test Cases” (Publication No. CN116302984A).
  • Shenzhen Research Institute of Big Data
    Shenzhen, China | Sep 2021 - May 2023
    • Won two rounds of undergraduate research awards under Prof. Pinjia He for anomaly detection and root cause analysis research.
    • Built a novel regular-expression-based log parser that outperformed 13 major log parsers on 16 public datasets in both accuracy and efficiency.
    • Co-authored a paper on log parsing published in IEEE Transactions on Reliability.
  • Shenzhen Stock Exchange
    Shenzhen, China | Jun 2021 - Aug 2021
    • Crawled textual information from over 30 Chinese financial news websites using the Requests and Beautiful Soup libraries.
    • Conducted sentiment analysis and text classification on financial news data using NLTK; labeled 20,000+ lines of news data.
    • Led a team of interns to retrieve, label, and store company data using SQL and MongoDB.

Core Courses

The table below highlights my core coursework, categorized by education level and subject area.

Education Level | Category | Course Name
Graduate | CS | Algorithms For Applications
Graduate | CS | Machine Learning Engineering (MLE)
Graduate | CS | Natural Language Processing (NLP)
Graduate | CS | Computer Vision (CV)
Graduate | CS | Building Startup Systems (Full Stack Development)
Graduate | INFO | HCI & Design
Graduate | INFO | Applied Data & Decision Making
Graduate | INFO | Human Robot Interaction
Graduate | MBA | Business Fundamentals
Graduate | MBA | The Business of Gaming
Undergraduate | CS | Basic Machine Learning (Python)
Undergraduate | CS | Techniques for Data Mining (Python)
Undergraduate | CS | Data Structures (Java)
Undergraduate | CS | Operating Systems (C++)
Undergraduate | CS | Database System (SQL)
Undergraduate | INFO | Data Analytics (R)
Undergraduate | INFO | Data & Knowledge Management (SQL)
Undergraduate | INFO | Web Analytics & Intelligence (Python)
Undergraduate | MATH | Discrete Mathematics
Undergraduate | MATH | Optimization (MATLAB)
Undergraduate | FINTECH | Fintech Theory & Practice
Undergraduate | FINTECH | Blockchain & Decentralized Applications
Undergraduate | ECON | Basic Microeconomics
Undergraduate | ECON | Basic Macroeconomics
Undergraduate | ECON | Introductory Econometrics
Undergraduate | ECON | Game Theory & Business Strategy

Skills & Interests

Skills

Python, SQL, JavaScript, C++, Java, R, MATLAB, Solidity, Pandas, NumPy, Matplotlib, PyTorch, TensorFlow, Keras, Scikit-learn, OpenAI API, Visual Studio Code, Linux, Vim, HTML, CSS, Algorithms and Data Structures, Machine Learning (MLE), Natural Language Processing (NLP), Computer Vision (CV), Full-Stack Development (Flask, React), Database Management, Data Analytics

Interests

Semi-Professional Swimming, Piano, AI for Software Engineering, AI for Finance, Large Language Models (LLMs), FinTech, AIOps