Tanzir Ahmed


  • Courses: 2
  • Reviews: 9
Jul 10, 2020
Textbook used: No
Would take again: No
For Credit: Yes


Not Mandatory



Prof. Ahmed clearly knows his stuff. He has a bit of an accent, and he sometimes doesn't answer questions well. He gives 3 quizzes, a midterm, a final, and 4 programming assignments. His newborn kid and COVID made this semester rough, though. I would recommend taking this course during a regular semester rather than over the summer.

May 11, 2020
Textbook used: No
Would take again: Yes
For Credit: Yes





Dr. Tanzir is one of the most knowledgeable lecturers I've ever had. Despite his stuttering issues, he went above and beyond in clarifying subjects I was having trouble with, and I have never seen any other professor help out as aggressively through Piazza as he does. Be aware that the assignments will be demanding, so be prepared. When in doubt, seek assistance from the professor and TAs.


Texas A&M University College Station - Computer Science


  • 2001

    Bachelor of Science (BS)

    Computer Science

    Bangladesh University of Engineering and Technology

  • 3.82



    Doctor of Philosophy (Ph.D.)

    Computer Science

    Texas A&M University

  • Linux









    Visual Studio

    High Performance Computing





    On the Performance of MapReduce: A Stochastic Approach

    MapReduce is a highly acclaimed programming paradigm for large-scale information processing. However, there is no accurate model in the literature that can precisely forecast its run-time and resource usage for a given workload. In this paper, we derive analytic models for shared-memory MapReduce computations, in which the run-time and disk I/O are expressed as functions of the workload properties, hardware configuration, and algorithms used. We then compare these models against trace-driven simulations using our high-performance MapReduce implementation.

    Modeling Randomized Data Streams in Caching, Data Processing, and Crawling Applications

    Many BigData applications (e.g., web caching, search in large graphs) process streams of random key-value records that follow highly skewed frequency distributions. In this work, we first develop stochastic models for the probability to encounter unique keys during exploration of such streams and their growth rate over time. We then apply these models to the analysis of LRU caching, MapReduce overhead, and various crawl properties (e.g., node-degree bias, frontier size) in random graphs.

    Around the Web in Six Weeks: Documenting a Large-Scale Crawl

    Exponential growth of the web continues to present challenges to the design and scalability of web crawlers. Our previous work on a high-performance platform called IRLbot [28] led to the development of new algorithms for realtime URL manipulation, domain ranking, and budgeting, which were tested in a 6.3B-page crawl. Since very little is known about the crawl itself, our goal in this paper is to undertake an extensive measurement study of the collected dataset and document its crawl dynamics. We also propose a framework for modeling the scaling rate of various data structures as crawl size goes to infinity and offer a methodology for comparing crawl coverage to that of commercial search engines.

    High Performance MapReduce

    A shared-memory, high-performance MapReduce, much like Phoenix from Stanford, but external-memory and capable of solving MapReduce tasks with arbitrary input size. In this project, various MapReduce design choices were considered, developed, and benchmarked. The best design (arguably) is the one that uses hash tables for sorting and a selection tree for multi-way merge of the sorted runs. When using a large number of threads (on 16 CPU cores), the main challenge to scaling is the memory wall. The problem worsens in systems with multiple physical sockets and NUMA latency. We work around many such problems to achieve a sorting speed of 332 million keys/s (8-byte keys), which is much better than existing benchmarks.
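    The multi-way merge step mentioned above can be illustrated with a minimal sketch. This is not the project's code (which is not shown here): it is written in Python rather than C/C++, it uses a binary heap from the standard library as a stand-in for a selection tree, and the function name `merge_sorted_runs` is hypothetical. It only shows the idea of repeatedly emitting the smallest head key among several sorted runs.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

_END = object()  # sentinel marking an exhausted run


def merge_sorted_runs(runs: List[Iterable[int]]) -> Iterator[int]:
    """Merge already-sorted runs into one sorted stream.

    Illustrative only: a binary heap plays the role that a selection
    tree would play in a tuned implementation.
    """
    iterators = [iter(run) for run in runs]
    heap: List[Tuple[int, int]] = []

    # Seed the heap with the first key of every non-empty run.
    for run_id, it in enumerate(iterators):
        first = next(it, _END)
        if first is not _END:
            heap.append((first, run_id))
    heapq.heapify(heap)

    while heap:
        key, run_id = heap[0]  # smallest key among all run heads
        yield key
        nxt = next(iterators[run_id], _END)
        if nxt is _END:
            heapq.heappop(heap)  # this run is exhausted
        else:
            # Replace the emitted key with the next key from the same run.
            heapq.heapreplace(heap, (nxt, run_id))


merged = list(merge_sorted_runs([[1, 4, 9], [2, 3, 10], [], [5]]))
print(merged)  # [1, 2, 3, 4, 5, 9, 10]
```

    Each output key costs O(log k) comparisons for k runs. In an external-memory setting the runs would be read from disk in buffered blocks rather than held in lists; that part is omitted here.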



    Structured Data Systems Ltd

    Ranks Telecom Ltd.

    Texas A&M University

    Developed and improved various parts of the telecom billing system. This system consisted of downloading call records from the switching servers, processing them in an Oracle database using PL/SQL, and provisioning customers/numbers. As a member of a team, I developed a number of value-added services (e.g., top-up of account balance, ringtone and logo downloads, etc.) and the data billing system. I also developed numerous modules of the enterprise management software (PHP/MySQL/Oracle on Linux) used by the employees for their day-to-day jobs.

    Ranks Telecom Ltd.

    Visiting Assistant Professor

    My research encompasses two fields: characterizing graph crawl experience and high-performance MapReduce systems.

    I have developed stochastic models for various aspects of a crawl (e.g., degree distribution, uniqueness of the nodes) on random graphs. By characterizing these aspects, predictive models for the unseen portion are inferred. In addition, this has interesting implications in various graph theoretic problems (e.g., rumor spreading, etc.).

    My other research focus is developing and analyzing high-performance MapReduce programs (using C/C++) for multi-core SMP systems capable of handling arbitrary BigData problems (e.g., inverting a 7 TB web graph downloaded by Internet Research Lab, TAMU). This is complemented by stochastic models for intermediate data (sorted runs written to disk, to be merged in the next step) and the total run-time of such programs, both of which are of significant interest to the BigData community. Overall, the objective is to examine various MapReduce design/algorithm choices and find the best mix of them that can process data at the fastest possible rate permitted by the hardware (number of CPU cores and their speed, memory and I/O bandwidth).

    Texas A&M University

    Software Engineer

    Worked on existing GIS-based software using Visual C/C++.

    Structured Data Systems Ltd