2013-03-11, 07:15  #2234
Romulan Interpreter
Jun 2011
Thailand
2634_{16} Posts 
We have had this discussion a few times in the past; I still remember the last attempt. It turns out that one would need a good way to eliminate the algebraic and intrinsic factors first, and only then do the sieving. It is not difficult to change mfaktc if one doesn't care about missing some factors and the goal is just to find some other factors, whether or not they turn up in order. If someone is going to make a new mfaktc, I would suggest introducing a new flag, like "allowcomposite" or whatever, so that the program does not check whether the exponent is composite when the flag is present (but still checks that it is odd, otherwise we have trouble with 3, 5 (mod 8), more classes to parse, and different logic). Modifying the program to always allow composite exponents could result in futile work when someone makes a typo, for example. What I mean is that it is better to keep the primality check in the program's default options, but allow odd composites if some "special" flag is present. For these odd composites, the program would work exactly the same way as it does for prime exponents. Of course, it will miss the algebraic factors.
Last fiddled with by LaurV on 2013-03-11 at 07:29
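A minimal Python sketch of the check proposed in the post above, not taken from the mfaktc source. The names `exponent_ok`, `candidate_ok` and the `allow_composite` parameter are hypothetical. It relies on one known fact: for odd n, any factor q of 2^n - 1 is 1 or 7 (mod 8), because 2^(n+1) = 2 (mod q) makes 2 a square mod q. For even n this fails (2^4 - 1 = 3·5), which is the "3, 5 (mod 8)" trouble.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test; adequate for exponent-sized inputs."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def exponent_ok(n: int, allow_composite: bool = False) -> bool:
    """Accept odd primes by default; accept odd composites only with the flag.

    `allow_composite` is a hypothetical stand-in for the suggested flag.
    """
    if n < 3 or n % 2 == 0:
        return False  # even exponents are always rejected
    if is_prime(n):
        return True
    return allow_composite


def candidate_ok(k: int, n: int) -> bool:
    """For odd n, q = 2*k*n + 1 can divide 2**n - 1 only if q is 1 or 7 mod 8."""
    return (2 * k * n + 1) % 8 in (1, 7)
```

For n = 15, 2^15 - 1 = 7·31·151. The candidates 31 (k = 1) and 151 (k = 5) have the form 2kn + 1 and pass the filter. The algebraic factor 7, which comes from 2^3 - 1, is never generated as a candidate, so it is missed as the post says.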
2013-03-11, 07:23  #2235
Bemusing Prompter
"Danny"
Dec 2002
California
2^{2}·3^{2}·67 Posts 
Quote:


2013-03-11, 10:18  #2236
"Nancy"
Aug 2002
Alexandria
2,467 Posts 
I'm not familiar with the mfaktc code at all, or I could probably make the changes myself. If the changes to allow composite exponents are implemented, maybe I can add some code to skip useless classes according to quadratic character, if such changes are well-localized.
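As a rough illustration of what skipping classes means, the Python sketch below enumerates residue classes of k modulo 4620 = 4·3·5·7·11 and keeps only those where q = 2kn + 1 is 1 or 7 (mod 8) and not divisible by 3, 5, 7 or 11. The function name `surviving_classes` is hypothetical, and this is a brute-force model rather than mfaktc's actual class logic.

```python
def surviving_classes(n: int, num_classes: int = 4620) -> list:
    """Return the classes c (k = c mod num_classes) that can still hold factors.

    Assumes n is odd and num_classes is 420 or 4620, so that k mod num_classes
    fixes q mod 8 and q mod each small prime dividing num_classes.
    """
    small_primes = [p for p in (3, 5, 7, 11) if num_classes % p == 0]
    keep = []
    for c in range(num_classes):
        q = 2 * c * n + 1
        if q % 8 not in (1, 7):
            continue  # 2 would not be a quadratic residue mod q
        if any(q % p == 0 for p in small_primes):
            continue  # every q in this class is divisible by a small prime
        keep.append(c)
    return keep
```

For an exponent coprime to 3, 5, 7 and 11 (n = 127, say) this leaves 960 of 4620 classes, or 96 of 420. For n = 15 more classes survive, 1800 of 4620, because q = 2kn + 1 is always 1 modulo 3 and modulo 5 and those primes can no longer eliminate anything.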

2013-03-11, 17:48  #2237
"Oliver"
Mar 2005
Germany
10001010111_{2} Posts 
Removing the check for a prime exponent is easy, and the classes stuff is easy too, as long as we keep 420/4620 classes. But the (CPU) sieve needs to be reworked: currently there is no code to remove primes from the sieve base, but this would be needed for composite exponents, otherwise there will be an endless loop in the offset calculation. Anyway, it seems feasible if someone wants to do so.
Oliver 
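One way to picture the endless loop: for a sieve prime s, the offset calculation looks for the first candidate in a class that s divides. If s divides the exponent n, then q = 2kn + 1 is always 1 (mod s) and no such candidate exists. The Python sketch below is an assumption about the general shape of the problem, not mfaktc's code; `sieve_offset` and `sieve_base` are hypothetical names. It bounds the search at s steps and drops the offending primes from the sieve base.

```python
def sieve_offset(s: int, n: int, k0: int, num_classes: int = 4620):
    """Smallest j >= 0 with 2*(k0 + j*num_classes)*n + 1 divisible by s, or None.

    The residues mod s repeat with period s in j, so s iterations suffice;
    an unbounded search would never terminate when s divides n.
    """
    for j in range(s):
        if (2 * (k0 + j * num_classes) * n + 1) % s == 0:
            return j
    return None


def sieve_base(primes, n: int):
    """Drop sieve primes that divide the exponent; they can never divide q."""
    return [s for s in primes if n % s != 0]
```

With n = 221 = 13·17, `sieve_base([13, 17, 19, 23], 221)` keeps only 19 and 23, and `sieve_offset(13, 221, 1)` returns `None` instead of looping.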
2013-03-11, 18:30  #2238
"Nancy"
Aug 2002
Alexandria
2,467 Posts 
Btw, what is the relationship between mfaktc and mmff? I haven't kept up with developments, and digging through a 100-page thread seems a daunting task... is one a superset of the other?

2013-03-11, 18:43  #2239
Bemusing Prompter
"Danny"
Dec 2002
California
2^{2}·3^{2}·67 Posts 
mfaktc is for TF'ing GIMPS-class numbers, and mmff is for double Mersenne and Fermat numbers. The mmff source code is largely based on that of mfaktc.
Last fiddled with by ixfd64 on 2013-03-11 at 18:43
2013-03-12, 00:11  #2240
Just call me Henry
"David"
Sep 2007
Cambridge (GMT/BST)
2^{5}·5·37 Posts 

2013-03-17, 17:03  #2241
"Oliver"
Mar 2005
Germany
2127_{8} Posts 
So finally I got my hands on a Titan (temporarily), too.
It seems that I underestimated the boost clock. So with stock clock rates the Titan is the fastest GPU for mfaktc, but it wins only by a small margin over the old GTX 580. There might be a very small performance increase for the Titan once I test the new funnel shift instruction. The barrett_{76,77,79} kernels don't make use of multiword shifts except for the initialization, but barrett_{87,88,92} do a multiword shift in each iteration.
Oliver
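For readers unfamiliar with the instruction, here is a small Python emulation of a multiword left shift built on a funnel shift. `funnelshift_l` mimics the documented behaviour of the CUDA intrinsic `__funnelshift_l(lo, hi, shift)`: concatenate hi:lo, shift left by `shift & 31`, and return the upper 32 bits. `shift_left_words` is a hypothetical helper and the word layout (least significant word first) is an assumption. None of this is the barrett kernel code.

```python
MASK32 = 0xFFFFFFFF


def funnelshift_l(lo: int, hi: int, shift: int) -> int:
    """Emulate CUDA __funnelshift_l: upper 32 bits of (hi:lo) << (shift & 31)."""
    shift &= 31
    return ((((hi << 32) | lo) << shift) >> 32) & MASK32


def shift_left_words(words, shift: int):
    """Shift a multiword integer (32-bit words, least significant first) left.

    Each output word needs one funnel shift instead of two shifts and an OR.
    Bits shifted out of the top word are discarded. Requires 0 <= shift <= 31.
    """
    out = []
    prev = 0  # the word below the least significant one is zero
    for w in words:
        out.append(funnelshift_l(prev, w, shift))
        prev = w
    return out


def to_int(words) -> int:
    """Reassemble the little-endian 32-bit words into one Python integer."""
    return sum(w << (32 * i) for i, w in enumerate(words))
```

A kernel that does such a shift in every loop iteration gains from the single-instruction form, which fits the observation that only the kernels shifting per iteration stand to benefit.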
2013-03-17, 17:08  #2242
"James Heinrich"
May 2004
ex-Northern Ontario
6656_{8} Posts 

2013-03-20, 20:00  #2243
"Oliver"
Mar 2005
Germany
457_{16} Posts 
Yepp, I was right: the funnel shift gives a small advantage.
A quick hack using stock mfaktc 0.20 code and barrett_87 for testing on a Tesla K20 (CUDA 5.0):
Code:
    base                                 300.8 GHzd/d
    added code generation for sm_35      298.1 GHzd/d
    using funnel shift in barrett_87     308.9 GHzd/d
So barrett_87 now beats barrett_77; only barrett_76 is faster on GK110. For the current TF wavefront the impact is even lower because we do TF to 2^{73} there. But hey, it is an improvement...
Oliver
2013-03-23, 02:16  #2244
Aug 2002
2×3×5×277 Posts 
We have been messing around with the extremely confusing "EVGA Precision X" software. There are so many options it is ridiculous!
Anyways, we were messing with the memory clock setting. By default on our GTX690 it is 1502.3MHz. We lowered this to 1252.8MHz and the performance did not change! The temperatures and voltages do not change, either. Does this make sense? Would running the memory slower like that make it less likely to have a flipped bit? FWIW, the GPU clock is 1058.2MHz. The performance changes a lot when we mess around with that! 