
Questions

Configurations and their Impacts

  1. (1.5 points) Using the simulation data provided to you, compare the number of ticks that x264_v (vectorized) and x264_s (unvectorized) require to complete.
    • Is there a difference?
    • Does this match your expectations?
    • Explain.
  2. (1.5 points) Compare the number of instructions required to run conv1 scalar and conv1 vector with vlen = 256 and elen = 64.
    • Is there a difference in the number of instructions required to terminate?
    • Does this match your expectations?
    • Explain.
  3. (2 points) Compare the number of ticks required to run conv1 scalar and conv1 vector with vlen = 256 and elen = 64.
    • Did the number of ticks required to execute the benchmark change between the scalar and vector versions?
    • Does this match your expectations?
    • Explain.
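When comparing tick or instruction counts in the questions above, it may help to express the difference as a ratio rather than as raw numbers. The following is a minimal sketch, not part of the provided tooling: the function names `speedup` and `percent_change` are hypothetical, and you must supply the counts yourself from your own simulation output.

```python
def speedup(baseline: int, candidate: int) -> float:
    """Return how many times smaller the candidate count is than the baseline.

    A value above 1.0 means the candidate (e.g. the vector build) needed
    fewer ticks or instructions than the baseline (e.g. the scalar build).
    """
    if baseline <= 0 or candidate <= 0:
        raise ValueError("counts must be positive")
    return baseline / candidate


def percent_change(baseline: int, candidate: int) -> float:
    """Return the relative change from baseline to candidate, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (candidate - baseline) / baseline
```

Reporting both the ratio and the percentage makes it easier to compare the instruction-count reduction against the tick-count reduction, which need not be equal.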

Optimal vector configurations

  1. (3 points) Perform some experiments varying the vlen and elen parameters on the x264 benchmark.
    • Is there an optimal vector register size for this benchmark?
  2. (2 points) Run some experiments with conv1 on the simulator, varying the elen and vlen parameters.
    • Is there an optimal elen/vlen value?
    • Explain why this value is optimal, or if you did not find an optimal value, explain why the benchmark is not amenable to such an analysis.
    tip

    You may find it helpful to consult the source code of the benchmark.
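When planning the vlen and elen experiments, a rough model of how many elements fit in one vector register can help you choose which configurations to run. The sketch below assumes that vlen is the vector register width in bits, that elen is the widest supported element in bits, and that one register is used per operation; these are assumptions about the simulator, and the names `elements_per_register` and `strips_needed` are hypothetical.

```python
def elements_per_register(vlen: int, sew: int, elen: int) -> int:
    """Number of sew-bit elements that fit in one vlen-bit register.

    sew is the element width the code actually uses; it is assumed
    that sew may not exceed elen.
    """
    if sew > elen:
        raise ValueError("element width exceeds elen")
    if vlen % sew != 0:
        raise ValueError("vlen must be a multiple of the element width")
    return vlen // sew


def strips_needed(n: int, vlen: int, sew: int, elen: int) -> int:
    """Vector loop iterations needed to cover n elements (ceiling division)."""
    lanes = elements_per_register(vlen, sew, elen)
    return -(-n // lanes)
```

For example, under these assumptions `elements_per_register(256, 64, 64)` gives 4, so a loop over 10 elements would need `strips_needed(10, 256, 64, 64)`, that is 3, vector iterations. This model ignores memory bandwidth and pipeline effects, so measured ticks may not scale the same way.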

Assembly Analysis

important

We provide the assembly files for all the executables:

  • See the archive in the course announcements
  • conv1
  • conv2

If you are interested in cross compiling the conv benchmarks yourself, refer to the cross compilation tutorial.

For the following two questions, make sure to attach your annotated versions of the assembly code to your submission.

  1. (3 points) Consult the assembly code for both the vectorized (x264_v) and unvectorized (x264_s) versions of the x264 benchmark.
    • Identify three instances of vector operations in the vectorized version.
    • Identify the corresponding non-vectorized code snippets.
    • Comment on the portions of code you identified. Particularly, you should consider:
      • The number of operations and
      • Any additional guards added by the compiler (if any).
  2. (3 points) Consult the assembly code for the conv1 and conv2 benchmarks.
    • Identify three instances of vector instructions.
    • Are there differences in the vector instructions between the two benchmarks? You may compare either the number of vector instructions or their semantics.
    • Explain.
  3. (1 point) Comment on the difference in binary size between the x264_v and x264_s benchmarks.
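To get a first overview before annotating the assembly by hand, you could tally candidate vector mnemonics in a file. The sketch below is only a starting point and rests on two assumptions that you should verify against the provided files: that vector mnemonics begin with the letter `v` (as they do in the RISC-V vector extension) and that `#` starts a comment. The function name `count_vector_mnemonics` is hypothetical.

```python
from collections import Counter


def count_vector_mnemonics(asm_text: str) -> Counter:
    """Tally mnemonics that start with 'v' in assembly source text.

    Skips blank lines, assembler directives (starting with '.'),
    and lines that consist only of a label (ending with ':').
    """
    counts = Counter()
    for line in asm_text.splitlines():
        # Drop trailing comments, then surrounding whitespace.
        code = line.split("#", 1)[0].strip()
        if not code or code.startswith(".") or code.endswith(":"):
            continue
        mnemonic = code.split()[0]
        if mnemonic.startswith("v"):
            counts[mnemonic] += 1
    return counts
```

A possible use is `count_vector_mnemonics(open(path).read())`, where `path` points to one of the assembly files, followed by comparing the resulting counters for conv1 and conv2. A label that shares a line with an instruction is not handled, so treat the totals as a guide for where to look, not as a final answer.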

Conceptual Questions

  1. (2 points) When designing a chip, we need to consider the chip area allocated to a given feature and the workloads that the chip is likely to execute. With that in mind:
    • Give two examples of workloads that will benefit more from a vector unit than a complex branch-prediction unit.

Course Questions

  1. (1 point) The best time to be critical and to offer constructive suggestions about an assignment is soon after you complete it. Please include in your report the following:

    1. A narrative description of any difficulties or misunderstandings that made the assignment unnecessarily difficult, or that led you to waste time.
    2. Suggestions of changes that could make this assignment more interesting, more relevant or more challenging in future editions of this course.
  2. (2 points) Make use of proper typesetting in the report and overall presentation style. This includes the use of properly referenced figures, tables, and graphs (with descriptive axis titles and proper identification of units of measurement) where applicable.

  3. (2 points) Meeting notes. See the collaboration requirements for what to include.