Tim Cheng
Feb 1, 2022

Cross-entropy penalizes the wrong class significantly more heavily. As for the difference between KL and BCE, you can change the KL divergence's reduction to mean as well. Different losses may effectively call for different learning rates, so empirical testing should tell you the right ratio between the losses; a sketch below.
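A minimal PyTorch sketch of both points, assuming hypothetical logits and soft targets (the tensor names and the `alpha` weight are illustrative, not from the original discussion): `kl_div`'s `reduction` argument controls the scale of the loss, and a tuned coefficient balances the two losses when combined.

```python
import torch
import torch.nn.functional as F

# Hypothetical batch: 8 examples, 10 classes.
logits = torch.randn(8, 10)
target_probs = torch.softmax(torch.randn(8, 10), dim=-1)

# kl_div expects log-probabilities as input. Switching reduction
# between 'batchmean' (divide by batch size) and 'mean' (divide by
# total element count) rescales the loss, which changes its
# effective learning rate relative to other losses.
log_probs = F.log_softmax(logits, dim=-1)
kl_batchmean = F.kl_div(log_probs, target_probs, reduction='batchmean')
kl_mean = F.kl_div(log_probs, target_probs, reduction='mean')

# BCE lives on a different scale, so a weighting coefficient
# (found by empirical testing) balances the two terms.
bce = F.binary_cross_entropy(torch.sigmoid(logits), target_probs)
alpha = 0.5  # hypothetical ratio; tune empirically
total_loss = bce + alpha * kl_batchmean
```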
