Enhancements to MatrixProduct

This patch is fairly large, so I will try to summarize and comment:

  • Adapting code to gcc 10.
  • Generic code style and performance enhancements.
  • Adding PanelMode support.
  • Adding stride/offset support.
  • Enabling float64, std::complex<float> and std::complex<double>.
  • Fixing lack of symm_pack.
  • Enabling mixedtypes.
  • Adding std::complex tests to blasutil.

Comments:

Most tests that were failing now pass, except mixingtypes, which still needs to be implemented. The tests that still fail do so because there is no storePacketBlock for Incr != 1. The implementation for other increments is pretty slow, so I'll leave it up to the community to decide whether we end up doing it or not.
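To illustrate why a store with Incr != 1 tends to be slower, here is a minimal sketch. It does not use Eigen's storePacketBlock or any Eigen type: `Packet4f`, `storeContiguous` and `storeStrided` are hypothetical names, and the packet is modelled as a plain 4-element array rather than a real SIMD register.

```cpp
#include <array>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a 4-wide SIMD packet; not an Eigen type.
using Packet4f = std::array<float, 4>;

// Incr == 1: the four lanes are adjacent in memory, so a single block copy
// (one vector store on real hardware) writes the whole packet.
inline void storeContiguous(float* dst, const Packet4f& p) {
  std::memcpy(dst, p.data(), sizeof(float) * p.size());
}

// Incr != 1: consecutive lanes land `incr` elements apart, so this sketch
// falls back to one scalar store per lane.
inline void storeStrided(float* dst, const Packet4f& p, std::ptrdiff_t incr) {
  for (std::size_t i = 0; i < p.size(); ++i) {
    dst[static_cast<std::ptrdiff_t>(i) * incr] = p[i];
  }
}
```

With incr = 3, `storeStrided` writes to dst[0], dst[3], dst[6] and dst[9]. That per-lane scatter is the kind of cost that would make a generic strided implementation slow, assuming the target has no native scatter instruction.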

I'm aware that I used a union, which goes against community standards. Unfortunately the appropriate approach produces a strict-alignment warning and is pretty cumbersome, and I don't see a way to fix this for now.
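For readers unfamiliar with the trade-off, the following sketch contrasts union-based type punning with a byte copy through std::memcpy, which is one commonly suggested alternative. It is not taken from the patch: the `FloatBits` union and both helper functions are hypothetical, and whether std::memcpy is the "appropriate approach" meant above is an assumption.

```cpp
#include <cstdint>
#include <cstring>

// Union-based punning: write one member, read the other.
// GCC documents this as supported, but ISO C++ treats reading the
// inactive member as undefined behaviour.
union FloatBits {
  float f;
  std::uint32_t u;
};

inline std::uint32_t bitsViaUnion(float x) {
  FloatBits fb;
  fb.f = x;
  return fb.u;
}

// Alternative that stays within the standard: copy the object's bytes.
// Compilers typically lower this to a plain register move.
inline std::uint32_t bitsViaMemcpy(float x) {
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "sketch assumes a 32-bit float");
  std::uint32_t u;
  std::memcpy(&u, &x, sizeof u);
  return u;
}
```

On GCC both helpers return the same bit pattern for a given input. The union version is shorter, which may be part of why it is tempting in packet code.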

Edited by Rasmus Munk Larsen
