I use a different word depending on the manner in which I use the data. If I have found the made-up dataset lying around and have pointed my algorithm at it in a confirmatory manner, then the word "synthetic" is fine. However, when I use this type of data, I have often invented it with the specific intent of showing off the capabilities of my algorithm. In other words, I invented data for the specific purpose of getting "good results". In such circumstances, I am fond of the term "contrived", along with an explanation of my expectations for the data. This is because I don't want anyone to make the mistake of thinking that I pointed my algorithm at some arbitrary synthetic dataset I found lying around and it really worked out well.

If I have cherry-picked data (to the point of actually making it up) specifically to make my algorithm work out well, I say so. This is because such results provide evidence that my algorithm can work out well, but provide only very weak evidence that one might expect the algorithm to work out well in general. The word "contrived" sums up nicely the fact that I have chosen the data with "good results" in mind, a priori.

"Does that give the impression of fraudulent data?" No, but it is important to be clear about the source of any dataset, and about your a priori expectations as the experimenter, when reporting your results on any dataset. The term "fraud" explicitly includes an aspect of having covered something up or having outright lied. The #1 way to avoid committing fraud in science is simply to be honest and forthright about the nature of your data and your expectations. In other words, if your data are fabricated and you fail to say so in any way, and there is some expectation that the data are not fabricated or, worse, you claim that the data were gathered in some non-fabricated way, then that is "fraud".
I've encountered the term 'fake data' a fair amount. I guess it could have some negative connotations, but I've heard it often enough that it doesn't register negatively at all for me. A quick Google search for 'fake data' turns up a lot of results that seem to use the term similarly. There's even a fakeR package, which suggests that this usage is relatively common.