SciProgCentre/kmath

Signal processing + FFT

Opened this issue · 30 comments

I think that i would really like to work on this, however i would like to be able to integrate my code fairly efficiently and with as little integration problems with the current code base as is possible. If there are any guidelines that you would like me to follow such as documentation, runtime efficiency or anything of that sort please let me know. This goes for the git comment in my commits as well. I have some experience using numpy and scipy, but mostly in respect to keras and using tensorflow as the backend. So if it is alright with you i would like to be able to confer with someone that is better versed in the algorithms needed to be employed, prior to coding up a lot for naught. As i am only an undergrad currently i hope it wouldnt be too bad to ask quite a few questions, thank you.

Very glad to hear it. Please proceed at pace you like, I do not have time to work on that in near future.
The general guildlines are following:

  • It should be a stand-alone mpp module with JVM and optionally JS support (we will include native later).
  • It should follow API-first approach. That means that our task is to create convenient API and adapt the implementation or implementations to work with it, Not the other way around. So starting with algorithms maybe not the best choice.
  • For complicated algorithms it is better to use existing platform-specific implementations instead of moving everything to kotlin at once (it could be done later). For simple algorithms and crucial API it is better to re-implement them.
  • For release we want a documentation page with general structure and examples (I do not follow it myself yet).

For this specific task, It would be good to have a proposal first. Maybe in KEEP-math. We do not want to blindly duplicate numpy approach. Kotlin is much more expressive than python and could allow some very important features like coroutine-based online processing, which means the streaming API. Just few minutes ago I was thinking that something similar could be done to generalize @thomasnield statistics API.

There are some efforts to create a multiplatform wrapper for tensorflow. IT makes sense to investigate it and use as a back-end.

The discussion is quite welcome here as well as in slack

@kapot65 @Zelenyy, you are welcome to contribute to the discussion here and in slack. We can also invite Alexander Kiselyov, I do not know his account name.

You can get an invitation here

You should start with just documenting the use case. What is the initial data and what do you do with it. I've almost finished the coroutines-based API that probably could be used both for regular array processing (like in python) and streaming processing simultaneously. I hope I will complete the initial design until the end of the week.

OK.

Compute 1 dimensional discrete Fourier Transform

Kmath.fft.fft( a, n=null, axis=-1,norm=None)

  • Parameters:
    • a: input array (Not sure how would be nice to have complex)
    • n: int (optional) Length of transformed axis of output, if smaller than input input is cropped, if larger input is padded with zeros, defaults to length of axis used
    • axis: int (optional) Which axis is used to compute the FFT, defaults to the last axis
    • norm: optional (string defaults to “”) normalization node
  • Output: complex ndarray(List?)
    input padded or truncated properly transformed along the axis
  • Raises:
    IndexError: if axis goes out of bounds of the size of a
  • Computes 1 dimensional n-point discrete Fourier Transform (DFT)
  • Possibly use a different formula than numpy uses, due to efficiency of the algorithm

Please let me know if this works for the one function,also if their is a better way to contact you so i dont have to block up this page with text that isnt too pertinant. Also where i would need to post the final proposal and approach i will take, i know you said KEEP-Math however im uncertain as to where. Thank you for your time

I've fixed the formatting in your post to make it more readable. Some questions though:

  • What is the type of a? If you have axis it is probably many-dimensional, but later you say that it is one-dimensional.
  • Could it be used in streaming mode or you need the whole array at once? Could it be spitted to sub-regions for parallelizatiion?
  • Does it make sense to have different algorithms for this?
  • What is the general workflow for the result? Does it make sense to store the reference to initial array?

For now you are mixing a lot of different problems into one. I would suggest to start with simple document describing what you want to do from mathematical point of view (including formulas). Then we can understand how to better implement it. Or you can start with implementation and we can integrate it.

The basic FFT is one-dimensional. Python uses secondary axis for vectorization which is neither requires, nor appreciated in Kotlin. We can do vectorization ourselves with functions or cycles.

There are existing Java implementations like commons-math which could be used for a start, we need only to invent convenient kotlin-ish API for it.

The API is rather simple it is just complex array transform. I can very easily add it to kmath-commons module. The actual question is whether we need something more like operation chaining etc.

Vectorization in python is required to achieve performance with C back-end. You keep kotlin code much closer to your actual math notation. Anyway, performance optimization is the next step. First we need to understand what to do in math notation.

OK. Seems good. This specific operation seems to be done on the whole array. It could be done in different ways. We can just write an extension over complex buffer, or we can add a context for such operation. The first solution is preferred if we will ever have only one such transformation and never will need state.

@Zelenyy, is it possible that we will need some kind of state for it? It seems so, especially for some kind of parametric wavelets or whatever. Also, do we also preserve initial buffer size?

I am thinking about design, not implementation. But yes, more examples would be good.

I've added the transformation implementations from commons-math here. The usage is following:

val myComplexBuffer: Buffer<Complex>// take it from somewhere
val res = Transformations.fourier().invoke(myComplexBuffer)

I've also finished streaming support for buffer processing and added helper functions to organized the workflow. Currently, if you have a streaming producer of Complex numbers from somewhere, you can split it into fixed size chunks and process run through fft like this:

val producer: Producer<Complex>
val resultProducer: Producer<Complex> = producer.chunked(200).FFT()

It is not palatalized yet, but could be easily added in the future. For now, the whole cycle is still untested, so I would very grateful if someone would create some test material like file with initial numbers and final expected numbers or some algorithm to check the result.

Please disregard the previous message the function I used to attempt to validate the output was the wrong function