We proposed a Cross-ViT architecture with a dedicated Masksemble block to produce discriminative image features. The Masksemble layer estimates the uncertainty of a given dermatoscopy image, which plays a crucial role in cancer identification, and the result is then passed to the Cross-ViT network for the classification task.
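
To make the idea of mask-based uncertainty estimation concrete, the following is a minimal sketch and not the implementation used in this work. It assumes a small set of fixed binary masks applied to a feature vector, a hypothetical linear classification head (`weights`), and predictive entropy of the mask-averaged class probabilities as the uncertainty score. The masks are drawn at random once and then reused, which is a simplification of how Masksemble-style masks are usually generated; all function names and sizes are illustrative assumptions.

```python
import math
import random


def make_masks(n_masks, n_features, keep=0.5, seed=0):
    """Draw a fixed set of binary masks (assumption: random, drawn once)."""
    rng = random.Random(seed)
    n_keep = max(1, round(keep * n_features))
    masks = []
    for _ in range(n_masks):
        kept = set(rng.sample(range(n_features), n_keep))
        masks.append([1.0 if i in kept else 0.0 for i in range(n_features)])
    return masks


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_uncertainty(features, masks, weights):
    """Average class probabilities over all masks and score their entropy.

    weights holds one weight vector per class (hypothetical linear head).
    """
    per_mask = []
    for mask in masks:
        masked = [f * m for f, m in zip(features, mask)]
        logits = [sum(x * w for x, w in zip(masked, row)) for row in weights]
        per_mask.append(softmax(logits))
    n_classes = len(weights)
    mean = [sum(p[c] for p in per_mask) / len(per_mask) for c in range(n_classes)]
    # Predictive entropy: higher values mean the masked sub-models disagree
    # or are individually unsure.
    entropy = -sum(p * math.log(p) for p in mean if p > 0)
    return mean, entropy


# Toy usage with synthetic features standing in for image embeddings.
rng = random.Random(1)
features = [rng.gauss(0, 1) for _ in range(16)]
weights = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(3)]
masks = make_masks(n_masks=4, n_features=16)
mean_probs, uncertainty = predict_with_uncertainty(features, masks, weights)
```

In this sketch the entropy is bounded by the logarithm of the number of classes, so it can be normalized to the range [0, 1] if a comparable score across tasks is needed. How the uncertainty estimate is actually combined with the Cross-ViT input is not specified here and is left out of the example.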