Question #1
Your ECE friends over at Hamerschlag Hall are looking to implement hardware for a new computing system and have asked for your help in choosing a specification for their 16-bit floating point format.
In addition to the 64-bit (FP64) and 32-bit (FP32) formats, the IEEE 754 standard also specifies a 16-bit (FP16) floating point format. The 16 bits are divided as follows:
- 1 sign bit
- 5 EXP bits
- 10 FRAC bits
Google Brain, however, created their own Brain Floating Point Format (BFLOAT16) for use in their deep learning systems. The 16 bits are divided as follows:
- 1 sign bit
- 8 EXP bits
- 7 FRAC bits
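
As a concrete (and purely illustrative) reference, the sketch below shows how a raw 16-bit pattern splits into sign/EXP/FRAC fields under each of the two layouts above. It is not part of the question; the names `fields_t`, `unpack_fp16`, and `unpack_bfloat16` are just assumptions made for this example, and the field widths come straight from the specs listed above.

```c
#include <stdint.h>
#include <stdio.h>

typedef struct {
    unsigned sign;  /* 1 bit in both formats                */
    unsigned exp;   /* 5 bits in FP16, 8 bits in BFLOAT16   */
    unsigned frac;  /* 10 bits in FP16, 7 bits in BFLOAT16  */
} fields_t;

/* FP16 layout: [sign:1][EXP:5][FRAC:10] */
fields_t unpack_fp16(uint16_t bits) {
    fields_t f;
    f.sign = (bits >> 15) & 0x1;
    f.exp  = (bits >> 10) & 0x1F;
    f.frac = bits & 0x3FF;
    return f;
}

/* BFLOAT16 layout: [sign:1][EXP:8][FRAC:7] */
fields_t unpack_bfloat16(uint16_t bits) {
    fields_t f;
    f.sign = (bits >> 15) & 0x1;
    f.exp  = (bits >> 7) & 0xFF;
    f.frac = bits & 0x7F;
    return f;
}

int main(void) {
    uint16_t bits = 0x3C00;             /* 1.0 when read as FP16 (EXP = 15, FRAC = 0) */
    fields_t h = unpack_fp16(bits);
    fields_t b = unpack_bfloat16(bits); /* same bit pattern read under the BFLOAT16 layout */
    printf("FP16:     sign=%u exp=%u frac=%u\n", h.sign, h.exp, h.frac);
    printf("BFLOAT16: sign=%u exp=%u frac=%u\n", b.sign, b.exp, b.frac);
    return 0;
}
```

Note that the same 16-bit pattern decodes to different EXP and FRAC values under the two layouts, which is exactly the tension the questions below ask you to reason about.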
a. Describe the tradeoffs between the FP16 and BFLOAT16 formats, i.e., how they compare in range (largest and smallest positive values) and step size (distance between neighboring numbers). No need to calculate anything; just give a qualitative explanation based on the specs of each format.
b. List any problem(s) that might arise when converting certain numbers from FP16 to BFLOAT16 and vice versa.
c. Now think about how converting from FP16 to FP32 would work. What would you need to do to the EXP and FRAC fields of the FP16 number?
d. Google Brain was formed in 2011 to leverage massive computing resources to perform deep learning research. Knowing that they need to do a ton of number conversions, why do you think they chose to create their own 16-bit floating point format that uses exactly 8 EXP bits? (Hint: How many EXP bits does FP32 have?)