I understand that a mutex is a protection mechanism to prevent data races, not an inhibitor of concurrency. However, if multiple threads only ever work with the mutex-protected data, then I think execution becomes close to sequential: one thread locks the mutex and works with the data, the others wait for it to be unlocked, and the cycle continues.
Probably I am thinking too narrowly or missing the big picture of mutexes.
You're right to be concerned. The scenario you're picturing is a common beginner mistake when trying to introduce threads into a problem designed without them: put the data in a mutex, start N threads, and have all those threads access the mutex. That indeed gets you zero parallelism.
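Here's a minimal sketch of that antipattern in Rust (the `expensive` function, the item count, and the thread count of 4 are made up for illustration): every worker holds the one big lock for the whole duration of its work, so the four threads just take turns.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Stand-in for real per-item work.
fn expensive(x: u64) -> u64 {
    (0..10_000u64).fold(x, |acc, i| acc.wrapping_mul(31).wrapping_add(i)) % 1_000
}

fn main() {
    // All the work items live behind a single mutex.
    let items = Arc::new(Mutex::new((0..100_000u64).collect::<Vec<_>>()));

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let items = Arc::clone(&items);
            thread::spawn(move || loop {
                // Lock, then do the expensive work *while still holding the lock*:
                // every other thread is blocked the whole time.
                let mut guard = items.lock().unwrap();
                match guard.pop() {
                    Some(x) => {
                        let _ = expensive(x);
                    }
                    None => break,
                }
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
}
```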
In order to benefit from threads, there must be some work done while not holding the mutex — work done on data that is either not shared, or not mutable.
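Reworking the sketch above so the guard is dropped before the expensive call restores parallelism: each thread holds the mutex only long enough to take the next item, does the real work unlocked, and finishes with one short critical section to merge its result.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Stand-in for real per-item work.
fn expensive(x: u64) -> u64 {
    (0..10_000u64).fold(x, |acc, i| acc.wrapping_mul(31).wrapping_add(i)) % 1_000
}

fn main() {
    let items = Arc::new(Mutex::new((0..100_000u64).collect::<Vec<_>>()));
    let total = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let items = Arc::clone(&items);
            let total = Arc::clone(&total);
            thread::spawn(move || {
                let mut local_sum = 0u64;
                loop {
                    // Hold the lock only long enough to take one item; the
                    // temporary guard is dropped at the end of this statement.
                    let next = items.lock().unwrap().pop();
                    match next {
                        // The expensive part runs with the mutex unlocked,
                        // so the other threads can make progress in parallel.
                        Some(x) => local_sum += expensive(x),
                        None => break,
                    }
                }
                // One short critical section at the end to merge results.
                *total.lock().unwrap() += local_sum;
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
    println!("total = {}", *total.lock().unwrap());
}
```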
Even the briefest use of a mutex adds some delay, but you want to keep that delay as small as possible. Use the mutex as a communication device, not as the place where the data you work on is kept. You can also use many mutexes (one for each independent part of your data), or communication tools that are not mutexes at all, such as atomics and channels, which aren't oriented around doing work while excluding other threads.
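For comparison, here is a hypothetical channel-based version of the same computation: each thread gets its own chunk up front, so no data is shared at all, and the only synchronization is the `std::sync::mpsc` channel used to send partial results back.

```rust
use std::sync::mpsc;
use std::thread;

// Stand-in for real per-item work.
fn expensive(x: u64) -> u64 {
    (0..10_000u64).fold(x, |acc, i| acc.wrapping_mul(31).wrapping_add(i)) % 1_000
}

fn main() {
    let items: Vec<u64> = (0..100_000).collect();
    let (tx, rx) = mpsc::channel();

    // Give each thread its own chunk: the input is never shared,
    // and results flow back through the channel.
    for chunk in items.chunks(items.len() / 4 + 1) {
        let tx = tx.clone();
        let chunk = chunk.to_vec();
        thread::spawn(move || {
            let partial: u64 = chunk.iter().map(|&x| expensive(x)).sum();
            tx.send(partial).unwrap();
        });
    }
    drop(tx); // close the channel so the receiver below knows when to stop

    let total: u64 = rx.iter().sum();
    println!("total = {}", total);
}
```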
Arguably, in most programming languages the mistake would be even worse, since a mutex is not required for the program to compile; the result is data races or race conditions, and ultimately weird bugs, rather than mere performance degradation.