
Computer Science EE Research Question Generator


Sample Computer Science EE Topic Ideas

Browse these sample topics for inspiration.

Easy

What is the relative runtime (average and worst-case in ms) and peak memory usage of quicksort versus mergesort when sorting arrays of 10k, 100k and 1M random 32-bit integers implemented in Java on the JVM?
Suggested Approach

Start by planning an experiment that directly answers the research question as written: compare quicksort and mergesort implemented in Java on the JVM for arrays of 10k, 100k and 1M random 32-bit integers, reporting average and worst-case runtimes (ms) and peak memory usage. Implement both algorithms yourself (or use minimal, clearly annotated implementations) so you can control pivot choice and allocation behaviour. Keep the rest of the program identical for both algorithms: same random-generator seeds, same array creation code, same JVM options. Use fixed JVM flags for all runs (document -Xms, -Xmx, GC settings) and run each configuration on the same machine to avoid hardware variability. Prepare two kinds of inputs: a large set of independently generated random arrays for measuring average-case behaviour, and carefully chosen adversarial inputs to elicit worst-case behaviour for each algorithm (for example, inputs that trigger quicksort’s worst case with your chosen pivot strategy, and pathological shapes for mergesort if you include in-place variants). State your hypothesis but do not change the research question.

When collecting data, control for JVM warm-up and garbage collection. Run a warm-up phase of repeated sorts before timing, discard warm-up measurements, then perform many timed trials (at least 30 per size and per algorithm) to calculate reliable statistics. Measure elapsed time with a high-resolution timer (System.nanoTime) and convert to ms. Estimate peak heap usage by sampling Runtime.getRuntime().totalMemory() minus Runtime.getRuntime().freeMemory() during the sort and keeping the largest value; because this only captures used heap at the sampling instants, consider using external tools (jcmd, jstat, or a profiler) to corroborate peak memory, including native memory. Log per-trial values, CPU affinity if possible, and CPU temperature/environmental notes. Keep implementation details in an appendix and include full code, compiler/JVM version, OS, and hardware specs so others can replicate your results.
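To illustrate what an adversarial input means for one specific pivot strategy, here is a minimal sketch. It is written in Python purely for brevity (the essay's own implementations would be in Java) and assumes a last-element-pivot (Lomuto) quicksort, which is only one of many possible pivot choices. The function name quicksort_comparisons and the array size of 200 are illustrative, not part of the research question.

```python
import random


def quicksort_comparisons(values):
    """Sort a copy of `values` with a last-element-pivot (Lomuto) quicksort
    and return how many element comparisons were performed."""
    a = list(values)
    comparisons = 0
    stack = [(0, len(a) - 1)]  # explicit stack avoids Python's recursion limit
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        pivot = a[hi]  # assumption: pivot is always the last element
        i = lo
        for j in range(lo, hi):
            comparisons += 1
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]
        stack.append((lo, i - 1))
        stack.append((i + 1, hi))
    return comparisons


n = 200
rng = random.Random(1)  # fixed seed so the run is repeatable
random_input = [rng.randrange(-2**31, 2**31) for _ in range(n)]
adversarial_input = list(range(n))  # already sorted: worst case for this pivot

print("random input comparisons:     ", quicksort_comparisons(random_input))
print("adversarial input comparisons:", quicksort_comparisons(adversarial_input))
```

For this pivot rule an already-sorted array forces every partition to be maximally unbalanced, so the comparison count reaches n(n-1)/2. A different pivot strategy (median-of-three, random pivot) needs a different adversarial input.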
Analyse results with clear statistical summaries and visualisations: report mean, median, standard deviation, confidence intervals and the single worst measured runtime and peak memory for each combination (algorithm × input size × input type). Interpret differences in terms of algorithmic behaviour, constant factors, memory allocation patterns (e.g. mergesort’s additional arrays vs quicksort’s stack usage), and JVM effects like JIT optimisations and GC pauses. Discuss limitations (measurement error, hardware, implementation choices) and conclude by tying empirical findings back to theoretical expectations, explaining why observed average and worst-case behaviours match or deviate from theory. Ensure your write-up follows EE structure, cites sources for JVM behaviour and algorithm theory, and places code and raw data in appendices for examiner verification.
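The summary statistics described above could be computed from the logged per-trial runtimes along these lines. This is a sketch rather than a prescribed method: the function name summarise is illustrative, and the confidence interval uses a normal approximation, which is an assumption that is reasonable at around 30 or more trials (a t-distribution would be more conservative for small samples).

```python
import math
import statistics


def summarise(trials_ms, confidence=0.95):
    """Return summary statistics for a list of per-trial runtimes in ms."""
    n = len(trials_ms)
    mean = statistics.mean(trials_ms)
    stdev = statistics.stdev(trials_ms)  # sample standard deviation
    # Normal-approximation critical value (about 1.96 for 95%).
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev / math.sqrt(n)
    return {
        "n": n,
        "mean": mean,
        "median": statistics.median(trials_ms),
        "stdev": stdev,
        "ci_low": mean - half_width,
        "ci_high": mean + half_width,
        "worst": max(trials_ms),  # single worst measured runtime
    }


# Hypothetical runtimes, for illustration only (not measured data).
example = [10.0, 12.0, 14.0]
for key, value in summarise(example).items():
    print(f"{key}: {value}")
```

Running one such summary per combination (algorithm × input size × input type) gives the rows of the results table directly.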



Relevant Exemplars
Efficiency of the Interpolation and Jump Search Algorithm’s runtime in different array sizes based on the positioning of the key

Medium

Does a ResNet-18 convolutional neural network or a Vision Transformer (ViT-small) achieve higher top-1 accuracy and lower single-image inference latency when classifying the CIFAR-100 dataset on a Raspberry Pi 4 running PyTorch, and which model is therefore better for low-power edge image classification?
Suggested Approach

Begin by framing your research question explicitly in the introduction and background sections: explain what ResNet-18 and ViT-small are, why CIFAR-100 is an appropriate benchmark, and why Raspberry Pi 4 with PyTorch represents a realistic low-power edge platform. Summarise relevant theory concisely—convolutional inductive biases, transformer patching, parameter counts, FLOPs, and typical trade-offs between accuracy and latency—and cite recent primary sources and model papers. Define the dependent variables (top-1 accuracy and single-image inference latency) and any secondary measures you might record (model size, memory usage, CPU/GPU utilization, temperature, and optionally power draw). State your hypothesis without changing the research question, then justify your experimental choices so the reader understands why these metrics and the chosen hardware/software stack are meaningful for edge deployment.

Design a clear, reproducible methodology section that a reader could replicate on another Raspberry Pi 4. Document model acquisition (pretrained weights or training from scratch), dataset preprocessing and train/validation/test splits for CIFAR-100, and any fine-tuning steps performed on-device or off-device. Describe how you will measure top-1 accuracy (fixed test set, identical preprocessing) and measure single-image inference latency (define warm-up runs, number of repetitions, statistical aggregation—mean, median, standard deviation—and use high-resolution timers). Control variables: ensure same PyTorch version, quantization state, input resolution, batch size of 1 for latency, CPU governor and thermal conditions, and consistent power conditions. If possible, include an energy measurement method (USB power meter or INA219) and note how you will log temperatures and system loads to separate thermal throttling effects from model differences. Provide pseudocode or flowcharts in the appendix and include the exact commands and versions used in a reproducibility checklist.
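A latency measurement with warm-up runs and repeated timed runs might look like the sketch below. It is deliberately framework-independent: run_once stands for whatever performs a single batch-size-1 inference (in the real experiment, a call to the model inside torch.no_grad()), and dummy_inference is a hypothetical stand-in so the sketch runs without PyTorch. The warm-up and repetition counts are placeholder values.

```python
import statistics
import time


def measure_latency_ms(run_once, warmup=20, repetitions=100):
    """Call `run_once` `warmup` times untimed, then time `repetitions` calls.
    Returns the per-call latencies in milliseconds."""
    for _ in range(warmup):
        run_once()  # warm-up results are discarded
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter_ns()  # high-resolution monotonic timer
        run_once()
        samples.append((time.perf_counter_ns() - start) / 1e6)
    return samples


def dummy_inference():
    """Hypothetical placeholder for a single-image forward pass."""
    sum(i * i for i in range(10_000))


samples = measure_latency_ms(dummy_inference, warmup=5, repetitions=30)
print(f"mean   = {statistics.mean(samples):.3f} ms")
print(f"median = {statistics.median(samples):.3f} ms")
print(f"stdev  = {statistics.stdev(samples):.3f} ms")
```

Keeping the raw samples list, rather than only the aggregates, lets you plot distributions later and spot thermal-throttling drift across the run.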

When analysing and writing results, present accuracy and latency with confidence intervals and use paired statistical tests or bootstrap resampling to show significance of observed differences. Visualise trade-offs (scatter plot of accuracy vs latency, table of resource metrics) and discuss sources of error or bias—dataset representativeness, transfer learning choices, and measurement noise. In the discussion and conclusion link findings back to the research question: interpret which model is better for low-power edge image classification and under which operational constraints (e.g., strict latency vs slightly higher power/size). Finally, reflect on limitations, suggest practical deployment recommendations, and include complete code, configuration files, and raw logs in an online appendix to satisfy IB expectations for methodology transparency and reproducibility.
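One way to apply bootstrap resampling to the latency comparison is sketched below. It assumes two independent lists of latency samples (one per model) and uses a simple percentile bootstrap of the difference in means; this is only one of several valid bootstrap variants, and the function name, resample count and example numbers are illustrative.

```python
import random
import statistics


def bootstrap_mean_diff_ci(a, b, resamples=2000, confidence=0.95, seed=0):
    """Percentile-bootstrap confidence interval for mean(a) - mean(b)."""
    rng = random.Random(seed)  # fixed seed keeps the analysis reproducible
    diffs = []
    for _ in range(resamples):
        resample_a = rng.choices(a, k=len(a))  # sample with replacement
        resample_b = rng.choices(b, k=len(b))
        diffs.append(statistics.mean(resample_a) - statistics.mean(resample_b))
    diffs.sort()
    low_index = int((1 - confidence) / 2 * resamples)
    high_index = resamples - 1 - low_index
    return diffs[low_index], diffs[high_index]


# Hypothetical latencies in ms, for illustration only (not measured data).
model_a_ms = [10.0, 11.0, 9.5, 10.5, 10.2]
model_b_ms = [5.0, 5.5, 4.8, 5.2, 5.1]
low, high = bootstrap_mean_diff_ci(model_a_ms, model_b_ms)
print(f"95% CI for mean latency difference: [{low:.2f}, {high:.2f}] ms")
```

If the interval excludes zero, the latency difference is unlikely to be explained by measurement noise alone. For accuracy, where both models are scored on the same test images, a paired test is the more natural choice.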



Medium

How do B-tree indexing and Log-Structured Merge-tree (LSM-tree) indexing compare for a key-value store under a write-heavy workload of 1,000,000 16-byte keys on a consumer NVMe SSD, measured by sustained write throughput (ops/s), read latency (ms), and write amplification ratio?
Suggested Approach

Begin by treating the research question as fixed: a comparison of B-tree indexing and LSM-tree indexing for a key-value store under a write-heavy workload of 1,000,000 16-byte keys on a consumer NVMe SSD, measured by sustained write throughput (ops/s), read latency (ms), and write amplification ratio. In your introduction and background, succinctly explain the internal mechanics of B-trees and LSM-trees, define the three metrics precisely (how you measure sustained write throughput, how you measure read latency including percentiles, and how you compute write amplification), and justify why a consumer NVMe SSD and 16-byte keys are relevant to the investigation. Use literature to explain expected trade-offs (e.g., random-write cost vs sequential writes, compaction overhead), cite sources for SSD behavior and indexing theory, and present a clear hypothesis about relative performance before you run experiments. Keep technical definitions concise but complete so the reader can understand later results without ambiguity.

Design a rigorous methodology section that you can reproduce and explain in the essay. Implement or configure two comparable key-value systems: one using B-tree indexing and one using an LSM-tree (examples: SQLite/LMDB-like B-tree and RocksDB/LevelDB for LSM), ensuring identical data models, key/value sizes, and concurrency. Control variables strictly: same NVMe device, same filesystem or raw device mode, identical OS and kernel settings, same buffer/cache behavior (document cache drop methods), and the same write-heavy workload pattern (specify the read/write ratio, request distribution, and total ops = 1,000,000 inserts/updates). Automate workloads with a benchmark harness (e.g., YCSB or custom tooling) to produce sustained load while logging ops/s, per-operation latencies (track mean and tail percentiles), and IO/write bytes to compute write amplification (host writes vs. device writes). Run multiple independent trials, with warm-up and steady-state periods, and collect sufficient samples to compute confidence intervals; document instrumentation and any scripts so results are reproducible.

When analysing and writing results, use clear tables and graphs to compare throughput, latency distributions, and write amplification; show both aggregated and time-series views to reveal transient compaction events or stalls. Apply basic statistical analysis (means, medians, standard deviation, and 95% confidence intervals) and discuss practical significance, not just statistical significance. Relate observed behavior back to theory: explain how compaction frequency, write buffering, and SSD write amplification contributed to the results, and acknowledge limitations (single device model, key size choice, implementation differences). Conclude by answering the research question directly, reflecting on reliability, and suggesting realistic follow-up experiments. Throughout, document code, configurations and raw logs in appendices or a repository and use in-text citations for all claims about algorithms and SSD behavior.
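Two of the metrics above reduce to short calculations once the logs exist. The sketch below assumes write amplification is defined as bytes physically written to the device divided by bytes of key-value data the application asked to write; which layers you compare is a choice you must state in the essay. The nearest-rank percentile, the 100-byte value size and the device byte count are illustrative assumptions, not values from the research question.

```python
import math


def write_amplification(logical_bytes, physical_bytes):
    """Ratio of bytes written to the device to bytes the application wrote."""
    if logical_bytes <= 0:
        raise ValueError("logical_bytes must be positive")
    return physical_bytes / logical_bytes


def percentile(samples, p):
    """Nearest-rank percentile (p in 0..100) of a non-empty list."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical workload: 1,000,000 keys of 16 bytes, each with an assumed
# 100-byte value, and an assumed device-level write counter reading.
logical = 1_000_000 * (16 + 100)
physical = 348_000_000
print(f"write amplification = {write_amplification(logical, physical):.2f}")

# Hypothetical read latencies in ms, for illustration only.
latencies_ms = [0.10, 0.12, 0.11, 0.95, 0.13, 0.10, 0.14, 0.12, 0.11, 2.40]
print(f"p50 = {percentile(latencies_ms, 50)} ms, p99 = {percentile(latencies_ms, 99)} ms")
```

Reporting tail percentiles alongside the mean matters here because compaction stalls tend to show up in p99 long before they move the average.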



Hard

How do AES-GCM and ChaCha20-Poly1305 compare in performance and energy cost when encrypting 1 KB messages on an ARM Cortex-M4 microcontroller, measured by encryption throughput (KB/s), latency per operation (ms), and energy consumption per byte (mJ/byte) using a hardware power monitor?
Suggested Approach

Begin by planning your experimental design around the research question exactly as written: you will compare AES-GCM and ChaCha20-Poly1305 on an ARM Cortex-M4 encrypting 1 KB messages, measuring encryption throughput (KB/s), latency per operation (ms), and energy per byte (mJ/byte) with a hardware power monitor. List dependent variables (throughput, latency, energy), independent variable (cipher choice), and controlled variables (message size, key size, peripheral states, MCU clock frequency, compiler optimization level, supply voltage, temperature). Choose or implement reliable, well-tested libraries for both algorithms (document versions and any hardware acceleration used), and prepare a repeatable test harness that timestamps operations on the MCU and triggers the power monitor. Calibrate the power monitor and verify timing accuracy with a known reference; record sampling rate, measurement uncertainty, and the method used to compute energy (integrating instantaneous power over each encryption operation). State a clear hypothesis and include safety and ethical notes about accurate reporting and reproducibility.
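Integrating instantaneous power over one encryption operation can be done with the trapezoidal rule, as in this sketch. It assumes the power monitor exports timestamps in seconds and power in milliwatts (so the integral comes out in millijoules) and that 1 KB means 1024 bytes; both are assumptions to adjust for your instrument and definitions. The function names and sample trace are illustrative.

```python
def energy_mj(timestamps_s, power_mw):
    """Trapezoidal integral of power (mW) over time (s), giving energy in mJ."""
    if len(timestamps_s) != len(power_mw) or len(timestamps_s) < 2:
        raise ValueError("need two or more matching timestamp/power samples")
    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        total += 0.5 * (power_mw[i] + power_mw[i - 1]) * dt
    return total


def energy_per_byte_mj(timestamps_s, power_mw, message_bytes=1024):
    """Energy per byte for one operation on a message of `message_bytes`."""
    return energy_mj(timestamps_s, power_mw) / message_bytes


# Hypothetical power trace for one encryption, for illustration only.
t = [0.000, 0.001, 0.002]
p = [50.0, 50.0, 50.0]
print(f"energy per operation = {energy_mj(t, p):.4f} mJ")
print(f"energy per byte      = {energy_per_byte_mj(t, p):.6f} mJ/byte")
```

Subtracting the MCU's idle power before integrating is a separate design decision; whichever you choose, state it, since it changes the absolute numbers for both ciphers.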

Execute the experiment methodically: freeze all controlled variables and run large numbers of trials (e.g., hundreds) for each cipher to gather stable statistics. For each trial record start/stop timestamps on the MCU for latency, count bytes for throughput, and capture the power trace for energy-per-operation calculations. Make sure to warm up the MCU to steady state if temperature affects results, and monitor temperature during runs. Vary no other settings except the cipher implementation; if you test both software-only and hardware-accelerated AES-GCM, report them as separate conditions. Log raw data, instrument settings, compiler flags, and full code listings in an appendix. Use automated scripts to process data so you can recompute results if needed and to reduce human error.

When analysing and writing, present methods and results with enough technical detail to allow replication: include pseudocode, circuit diagrams for the measurement setup, sampling frequency, power integration method, and error analysis. Report mean, standard deviation, and confidence intervals for throughput, latency, and energy per byte; use appropriate statistical tests to show whether observed differences are significant. Discuss sources of systematic and random error, trade-offs between speed and energy, and the effects of implementation strategies. Structure the essay following the EE format (introduction, background, methodology, results, analysis, conclusion, bibliography, appendices) and ensure your background explains the cryptographic modes and ARM microcontroller characteristics needed to interpret results. Conclude by answering the research question directly, acknowledge limitations, and suggest future work. Include all data, code, and instrument logs in appendices for examiner verification.



Medium

How do TCP BBR and TCP Cubic compare in terms of average throughput (Mbps), end-to-end latency (ms), and fairness (Jain’s index) for 720p video streaming over a simulated cellular link with 20% packet loss and 50 ms base RTT using ns-3?
Suggested Approach

Start by translating your research question into a precise experimental plan for ns-3. Decide on the exact traffic model that represents 720p video streaming (constant bit-rate or realistic adaptive bitrate trace) and document why that model maps to the question. Keep the cellular link parameters fixed to the values in the research question (20% packet loss, 50 ms base RTT) and explicitly state any additional link parameters you choose (bandwidth, queue size, loss model, bufferbloat behavior). Use the ns-3 implementations of TCP BBR and TCP Cubic available in the version you run; note the exact ns-3 release and any patches or external code you add. Define dependent variables (average throughput in Mbps, end-to-end latency in ms, fairness using Jain’s index) and independent variable (TCP congestion algorithm). List and justify controlled variables (packet size, video flow count, background traffic patterns, simulation duration). Plan for sufficient repetition using different RNG seeds to produce confidence intervals — aim for at least 30 runs per scenario if computationally feasible, or justify a smaller number with a power calculation; record seed values and script invocations so your experiment is reproducible.

Implement the simulations and collect raw traces (per-packet timestamps, sequence numbers, ACKs) so you can compute average throughput, one-way or RTT-based latency statistics, and Jain’s index for fairness across flows. Post-process traces with reproducible scripts (Python, R) and save intermediate data files. Compute summary statistics with means, standard deviations, and 95% confidence intervals; where distributions are non-normal, use non-parametric tests (Mann–Whitney U) or bootstrap confidence intervals. Present results visually with CDFs for latency, time-series or boxplots for throughput, and bar charts with error bars for Jain’s index; include sample sizes and p-values or effect sizes to support claims. Examine transient behavior (ramp-up periods) as well as steady-state, and discuss interactions between loss, RTT, and congestion control mechanisms that explain observed differences.
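Jain's index for n flows with throughputs x_1 ... x_n is (sum of x_i)^2 divided by (n times the sum of x_i^2); it equals 1 when all flows get the same throughput and 1/n when a single flow takes everything. A minimal sketch follows, with illustrative throughput values that are not simulation results.

```python
def jains_index(throughputs):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2), in the range (0, 1]."""
    n = len(throughputs)
    sum_of_squares = sum(x * x for x in throughputs)
    if n == 0 or sum_of_squares == 0:
        raise ValueError("need one or more flows with non-zero throughput")
    total = sum(throughputs)
    return (total * total) / (n * sum_of_squares)


# Hypothetical per-flow average throughputs in Mbps, for illustration only.
print(jains_index([5.0, 5.0, 5.0, 5.0]))   # perfectly fair share
print(jains_index([10.0, 0.0, 0.0, 0.0]))  # one flow takes everything
print(jains_index([4.0, 3.0, 2.0, 1.0]))   # intermediate case
```

Compute the index once per simulation run (one value per seed) so that it can be summarised with the same confidence-interval treatment as throughput and latency.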

When writing the essay, follow the Computer Science EE structure: concise introduction and motivation, precise statement of the research question, technical background explaining TCP BBR and Cubic and the metrics used, detailed methodology (including parameter tables and code references), results with figures and statistical analysis, critical discussion, limitations, and conclusion. Cite primary sources for BBR and Cubic algorithms, ns-3 documentation, and any video traffic models. Include annotated code and full simulation commands in an appendix or repository link for reproducibility. In the discussion, interpret practical implications for 720p streaming over lossy cellular links, acknowledge limitations (simulator fidelity, model assumptions), and suggest further work without changing the research question. Ensure all claims are supported by data and clear statistical reasoning.


