# Clustering of dark matter tracers: generalizing bias for the coming era of precision LSS

###### Abstract

On very large scales, density fluctuations in the Universe are small, suggesting a perturbative model for large-scale clustering of galaxies (or other dark matter tracers), in which the galaxy density is written as a Taylor series in the local mass density, $\delta$, with the unknown coefficients in the series treated as free “bias” parameters. We extend this model to include dependence of the galaxy density on the local values of $\nabla_i\nabla_j\phi$ and $\nabla_i v_j$, where $\phi$ is the potential and $\mathbf{v}$ is the peculiar velocity. We show that only two new free parameters are needed to model the power spectrum and bispectrum up to 4th order in the initial density perturbations, once symmetry considerations and equivalences between possible terms are accounted for. One of the new parameters is a bias multiplying $s_{ij}s_{ji}$, where $s_{ij}=\left[\nabla_i\nabla_j\nabla^{-2}-\frac{1}{3}\delta^K_{ij}\right]\delta$. The other multiplies $s_{ij}t_{ji}$, where $t_{ij}=\left[\nabla_i\nabla_j\nabla^{-2}-\frac{1}{3}\delta^K_{ij}\right]\left(\theta-\delta\right)$, with $\theta=-\left(aH\,d\ln D/d\ln a\right)^{-1}\nabla_i v_i$. (There are other, observationally equivalent, ways to write the two terms, e.g., using $\theta-\delta$ instead of $s_{ij}s_{ji}$.) We show how short-range (non-gravitational) non-locality can be included through a controlled series of higher derivative terms, starting with $R^2\nabla^2\delta$, where $R$ is the scale of non-locality (this term will be a small correction as long as $k^2R^2$ is small, where $k$ is the observed wavenumber). We suggest that there will be much more information in future huge redshift surveys in the range of scales where beyond-linear perturbation theory is both necessary and sufficient than in the fully linear regime.

###### pacs:

98.65.Dx, 95.35.+d, 98.80.Es, 98.80.-k

## I Introduction

While measurements of galaxy clustering have been around for a long time Groth and Peebles (1977), to the point where the casual observer might think they must surely be almost finished, or at least well underway, in fact we have barely scratched the surface of the possibilities for measuring large-scale structure (hereafter, LSS, defined in this paper to mean surveys of any tracer of the large-scale mass density field – we will often call the tracer “galaxies”, but it could just as well be quasars Slosar et al. (2008a); Padmanabhan et al. (2008), the Ly$\alpha$ forest McDonald et al. (2005a); McDonald et al. (2006); Viel and Haehnelt (2006), galaxy cluster/Sunyaev-Zel’dovich effect measurements (Aghanim et al., 2008), 21cm surveys (Nusser, 2005; Chang et al., 2008), etc.). Measuring LSS should really be regarded as an exciting future probe of cosmology, with growth potential not a priori less than probes with less past success. The reason is simply that we have so far probed only a tiny fraction of the observable volume of the Universe. For example, the largest galaxy redshift survey with density approaching what is needed to fully sample the near-linear regime of clustering, the Sloan Digital Sky Survey (SDSS) Luminous Red Galaxy (LRG) survey Tegmark et al. (2006), probes cubic Gpc/h, or % of the comoving volume at . Figure 1 shows that the fraction of linear regime modes, i.e., easily usable information, probed by the LRGs is even smaller – barely 0.01% of the modes at – because the non-linear scale is smaller at higher $z$.

(For this figure, we have used for the non-linear scale, where is the linear growth factor. The normalization is somewhat arbitrary, depending on one’s definition of the non-linear scale, but changing it only changes the overall normalization of the figure. The redshift dependence is motivated by Crocce and Scoccimarro (2006a); Seo and Eisenstein (2007).)

The high precision of LSS statistics measured using future surveys probing appreciable fractions of the observable Universe Tang et al. (2008); Cimatti et al. (2008); Schlegel et al. (2007); Wang et al. (2008a); Chang et al. (2008); Visbal et al. (2008); Abdalla and Rawlings (2005); Hill et al. (2008); Glazebrook et al. (2005); Blake et al. (2008) will require an unprecedented level of accuracy in our theoretical/phenomenological calculations of predictions for the statistics, if we are to fully exploit the potential of these surveys for measuring fundamental physics/cosmology. On very large scales we can use linear theory, but the scale below which linear theory cannot be trusted at the level of the error bars will become larger and larger (corresponding to a smaller and smaller maximum reliable wavenumber ) as the error bars shrink. The number of Fourier modes in a three-dimensional survey goes like the cube of the maximum usable , i.e., in terms of raw information, extending the usable range of by a factor of 2 is equivalent to extending the volume of the survey by a factor of 8 (for a Gaussian field). As we will see (Fig. 4), the range of scales where corrections to linear theory are small (perturbative), but still statistically significant, can easily be a factor of for future large surveys. The point is simply that we have enormous leverage to extend the value of surveys through modeling improvements that extend the usable range of . For example, if a survey costs 50 million dollars, extending the effectively usable range by a mere factor of 1.3 (say, from to ) would be worth roughly 1000 person-years (at $60000 per year). Phenomenological theory associated with LSS surveys should be viewed not as a typical academic exercise, pursued by a few individuals or small groups because they think it is “interesting”, but instead as an industrial, infrastructure building endeavor, critical to surveys in much the same way as, say, the road up to the telescope.
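The mode-counting argument above can be checked with a few lines of arithmetic. The sketch below assumes, as the text does, that the information content scales with the number of Fourier modes, i.e., as the cube of the maximum usable wavenumber, and it reads the "worth" of a modeling improvement as the cost of the extra survey volume that would otherwise be needed. The function names are illustrative only.

```python
def mode_gain(k_ratio: float) -> float:
    """Factor by which the number of 3D Fourier modes grows when the
    maximum usable wavenumber is multiplied by k_ratio (Gaussian field)."""
    return k_ratio ** 3


def equivalent_person_years(survey_cost: float, k_ratio: float, salary: float) -> float:
    """Cost of the extra survey volume that would deliver the same number of
    modes, expressed in person-years at the given yearly salary."""
    extra_value = (mode_gain(k_ratio) - 1.0) * survey_cost
    return extra_value / salary


# Doubling the usable k range is equivalent to 8 times the volume.
print(mode_gain(2.0))
# A factor of 1.3 in k for a 50 million dollar survey, at 60000 dollars per year.
print(equivalent_person_years(50e6, 1.3, 60000.0))
```

With these inputs the second number comes out just under 1000 person-years, consistent with the rough figure quoted in the text.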

Better modeling is needed even for present, moderate precision surveys. For example, Sánchez and Cole (2008) shows clearly where the linear bias model Kaiser (1984) that we have been relying on for cosmological parameter estimation for decades is breaking down, by comparing results from SDSS and 2dF galaxies (see also Percival et al. (2007); Swanson et al. (2008)). The power spectra of two different types of galaxies are not related by a simple overall normalization factor (bias) – their ratio depends on scale, even on quite large scales where it was once hoped that linear theory would be good enough. This was not completely unanticipated; however, Sánchez and Cole (2008) also shows that the ad hoc fitting formula of Cole et al. (2005), which has been used recently to try to account for quasi-linear galaxy clustering, does not work well, and these problems lead to disagreement between cosmological parameters inferred from different galaxy surveys (see also Hamann et al. (2008)). Clearly, we have a lot of theoretical work to do if we want to fully exploit future, much more precise, LSS data.

For measurements of the baryonic acoustic oscillation (BAO) feature Seo and Eisenstein (2007); Eisenstein et al. (2007); McDonald and Eisenstein (2007); Eisenstein et al. (2005); Seo and Eisenstein (2003); Eisenstein et al. (1998); Eisenstein and Hu (1998); Shoji et al. (2008); Seo et al. (2008); Parkinson et al. (2007), ad hoc fitting formulas very carefully calibrated by simulations may be sufficient, but measuring other physics that produces less distinctive signatures in the power spectrum, e.g., redshift-space distortions aimed at constraining dark energy McDonald and Seljak (2008); White et al. (2008); Percival and White (2008); Wang (2008), or measurements of the shape of the power spectrum aimed at constraining modified gravity Bañados et al. (2008); Acquaviva et al. (2008), neutrino masses Pritchard and Pierpaoli (2008); Kiakotou et al. (2008); Brandbyge et al. (2008); Gratton et al. (2008); Fogli et al. (2008, 2007); Saito et al. (2008); Takada et al. (2006); Lesgourgues and Pastor (2006); Slosar (2006), inflation Takada et al. (2006), etc. De Lope Amigo et al. (2008); Mota et al. (2008); Takada (2006), will require well-motivated, rigorous descriptions of the relation between galaxy and mass density, i.e., bias models. In other words, better LSS theory will substantially enhance the constraining power of BAO-oriented surveys, by allowing the use of non-BAO information Jeong and Komatsu (2008).

Bias modeling can be roughly divided into two approaches (excluding attempts to simulate galaxies from something resembling first principles Yoshikawa et al. (2001); Blanton et al. (1999), which can be useful as a guide/spot-check for other methods, but are unlikely to be accurate and efficient enough to use for interpretation of precision statistics any time soon): The first approach might be called a bottom-up approach, where one starts with a model for how individual galaxies sit in the local small-scale mass density field (most recently almost always based on galaxies sitting in dark matter halos, but earlier on peaks or other features), and then computes large-scale clustering by including the large-scale correlation of the relevant small-scale density feature. The other approach might be called top-down, or perturbative, where one starts from the fact that large-scale fluctuations are small and expands a completely unknown relation between galaxies and mass, with generally infinite freedom (except typically for the assumption of locality, relative to the scale of observations) into a Taylor series in the density perturbations, where the coefficients of the first few terms in the series become the free parameters of the model (the main point of the renormalized bias scheme of McDonald (2006) was to demonstrate how this separation of scales can be done in an organized way — see Bernardeau et al. (2002) for a general review of LSS perturbation theory).

This paper takes the perturbative approach, but most recent work has been based in some way on dark matter halos (e.g., Yoo et al. (2008); White and Padmanabhan (2008); Zheng and Weinberg (2007); Yoo et al. (2006); Tinker et al. (2005); Neyrinck et al. (2005); Seljak (2000, 2001); Seljak and Warren (2004); Taruya and Suto (2000); Mo et al. (1997); Bond and Myers (1996a, b, c)). A strong foundation for halo models is the expectation that, with enough work, it should be possible to make accurate numerical simulations of the large-scale clustering of halos within a given cosmological model Reed et al. (2008) (it is much more difficult to fully quantify this clustering to the point where one does not need to make halo models based on the halos in full simulations, but that is only necessary for convenience). Unfortunately, we can see these halos only through the coarse probe of gravitational lensing Seljak et al. (2005), and it is not straightforward to determine the relation between halos and the more easily observable galaxies. The halo models therefore specify a “halo occupation distribution” (HOD) for the galaxies, i.e., a recipe for populating halos with galaxies. The hope of these models is that they can determine the HOD using information deeper into the non-linear regime than possible using the more general, less predictive, perturbative approach that we will discuss, but this is a difficult game. To be reliable, models that populate halos within a full numerical simulation must include enough freedom in the method for populating halos to cover all realistic possibilities. Models that further rely on analytic calculations for the clustering of halos introduce another level of complexity and possibility of error Smith et al. (2008, 2007).

To appreciate the small-scale complexity that we will bundle into a few perturbative bias parameters, it is useful to review the recent work toward understanding the details of halo models. The standard HOD assumption is that the number of galaxies in a halo is some relatively simple function of the mass of the halo. Even these relatively simple HODs have free parameters Zheng and Weinberg (2007). There is observational evidence that this form of HOD works qualitatively very well Tinker et al. (2008); however, the assumptions involved clearly can not be perfect. Gao et al. (2005) showed that the clustering of halos of a fixed mass depends significantly on the time when the halo formed (see also Wechsler et al. (2006); Reed et al. (2007); Harker et al. (2006)). This phenomenon is often called assembly bias. When combined with the possibility that the galaxy population within halos of a given mass can depend on the halo formation time, this means that it is necessary for the HOD to depend on more parameters than just mass. Croton et al. (2007) demonstrated this explicitly using semi-analytic models for galaxy formation (see also Zu et al. (2008)), and found that accounting for formation time or halo concentration in addition to mass explains only a fraction of the effect. Li et al. (2008) found that the magnitude and mass-dependence of the assembly bias depends on the definition of halo formation time (different definitions capture different aspects of the history of the halo). Angulo et al. (2008) extends these results to higher order statistics. Wetzel et al. (2007) showed that the clustering of massive halos depends on concentration in addition to mass, and also recent history of mergers. The simulations of Wang et al. (2007); Hahn et al. (2008) suggest that the relation between formation time and clustering for small halos is due to the effect of tides in high density regions suppressing later growth of small halos. 
The simulations and analytic calculations of Dalal et al. (2008a) suggest that at low masses assembly bias is again related to high density regions suppressing late-time accretion, and at high masses the effect is related to the curvature around the initial peak that grows into the halo. The simulations of Davis and Natarajan (2008) show that the clustering of halos at high redshift also depends significantly on their angular momentum, at fixed mass. Finally, simulations even show a population of halos that were once subhalos within a larger halo, but were ejected by interactions Wang et al. (2008b). Not surprisingly, the ejected halos do not cluster in the same way as other halos of the same mass. Generally, the idea that the mass density field breaks up neatly into halos, containing galaxies, which retain little information about their formation process, is a great qualitative way to picture the formation of structure, but we should not forget that it is a picture, not a calculation. Another assumption of typical halo models is that the distribution of satellite galaxies within dark matter halos follows the mass density profile, but this has been only roughly justified Macciò et al. (2006); van den Bosch et al. (2005); Nagai and Kravtsov (2005); Berlind et al. (2003). Explanations of why these issues are not fundamental problems for the HOD approach make the argument that the effects are not large enough to matter now, but not that they will not in the future Zheng and Weinberg (2007).

In the face of any uncertainty about whether the small-scale halo model is sufficient, a precision measurement of fundamental physics/cosmology that is consistent with prior expectations may be believed, but a truly new, unexpected result will not be. This is only a very meager form of progress. The same kind of thing can be said about the perturbative approach – as long as there is any question of whether the bias description is complete, the results will not be believed in any important situation. We believe that it is reasonable to hope that the perturbative bias approach can be made relatively airtight, as long as one does not try to push it beyond its range of validity. This paper is an attempt to make progress in that direction.

General understanding of large-scale clustering, independent of specific small-scale models for the dark matter tracer, has been developing gradually. Coles (1993) showed that if the galaxy density is a general function of the local mass density, and the mass density field is assumed to be Gaussian, the asymptotically large-scale galaxy correlation function will be proportional to the mass correlation function (except for special cases of the local function). Coles (1993) also showed that, under the same conditions, the galaxy power spectrum may go to a constant as $k\rightarrow 0$ (even if no white noise is introduced by hand). Fry and Gaztanaga (1993) introduced the perturbative bias model in the form that we will follow, where the galaxy density perturbation is first written as a completely general function, $f$, of the mass density perturbation $\delta$, and then the function is Taylor expanded, with the unknown coefficients in the series becoming the bias parameters, $b_k$, i.e.,

$$
\delta_g = f(\delta) = \sum_{k=0}^{\infty} \frac{b_k}{k!}\,\delta^k, \tag{1}
$$

with the mass density given by gravitational perturbation theory. Note that the observation that the first order term in this series describes simple scale-independent linear bias does not guarantee that higher order terms cannot cause large-scale deviations from this form. Scherrer and Weinberg (1998) showed, starting with the same Taylor series form of bias, that if the mass clustering is hierarchical, then , even if the local bias relation is applied on scales where the fluctuations are not small. The large-scale bias factor found by Scherrer and Weinberg (1998) was an infinite sum of terms proportional to powers of the mass density variance, a foreshadowing of the renormalized bias approach we follow in this paper McDonald (2006). They went on to show that the linear bias relation holds even if the local mass density does not determine the galaxy density uniquely, but only determines a random distribution for the galaxy density (with the randomness in that distribution independent from point to point). Finally, Scherrer and Weinberg (1998) showed that the galaxy power spectrum obeys the linear bias relation on scales similar to the correlation function, except the small-separation part of the correlation function, which deviates from linear bias, will contribute an added constant to the power spectrum (see also Dekel and Lahav (1999); Durrer et al. (2003); Seto (1999)), a foreshadowing of the noise renormalization that we will employ McDonald (2006). Heavens et al. (1998) found similarly that higher order corrections in straightforward gravitational perturbation theory starting from the local Taylor series model for bias produce terms that on large scales look like modifications of the linear theory bias or additional shot-noise. Generally, it has been pretty well established that linear bias plus white noise is the correct model for very large scale galaxy clustering Matsubara (1999); Narayanan et al. (2000); Coles et al. 
(1999), barring the introduction of long-range non-gravitational effects which essentially introduce deviations from this form by hand. McDonald (2006) put these results together into a neat computational package, by employing renormalization ideas from quantum field theory Peskin and Schroeder (1995) (some similar ideas were present in Taruya and Soda (1999)). The inconvenient results of Scherrer and Weinberg (1998); Heavens et al. (1998), that higher order calculations can affect clustering statistics on arbitrarily large scales, and that these corrections are sensitive to the assumed small scale smoothing (cutoff), are rendered observationally irrelevant by absorbing the inconvenient pieces into renormalizations of the existing bias parameters (including the noise level). This approach clears the way for pushing, in a systematic way, beyond the very large-scale, purely linear, regime and into the information-rich smaller scales where higher order corrections are non-negligible, and understanding the smoothing/cutoff issue becomes critical. Jeong and Komatsu (2008) showed that this approach describes clustering in simulations very well.

Remarkably, for all of the work on both the halo-based and perturbative approaches to bias, neither has generally been adopted, beyond the papers in which they are proposed, for use in the main stream of LSS power spectrum measurement and cosmological parameter estimation Komatsu et al. (2008); Tegmark et al. (2006); Percival et al. (2007); Seljak et al. (2006). In fact, even the proposers generally have not pushed their methods through to the point of making comprehensive parameter measurements (see Abazajian et al. (2005) for an exception). The widespread use of the demonstrably inadequate (when extrapolated beyond its original purpose) fitting formula of Cole et al. (2005) should really be seen as an embarrassing failure of the LSS theory community. This paper will, unfortunately, continue this legacy of failure, but with the hope that it can soon be rectified.

In this paper, we will improve the Eulerian bias model by allowing for dependence on the local velocity divergence and shear and the tidal tensor in addition to density. The reason to expect such dependence at some level is simple: two patches of space with the same final density did not necessarily follow the same path to reach that density, and that difference in history may affect the galaxy density at the time of observation. In perturbation theory up to some finite order, however, the entire density history of a patch is reconstructible given a finite number of local quantities like the velocity divergence and tidal tensor. This raises the hope that a completely unique, general, bias model can be constructed, covering all possibilities for large-scale clustering with a finite set of bias parameters. (One can always imagine unavoidable obstacles to this, e.g., long-range non-gravitational effects like inhomogeneous reionization affecting clustering Babich and Loeb (2006); Coles and Erdogdu (2007); however, to the extent that something like this is important on a given scale, very high precision cosmology is probably simply impossible on that scale.) While the primary philosophy of this paper is that any possible form of large-scale clustering should be included in the model, unless it can be compellingly rejected, there is actually a lot of evidence that these new forms of bias are needed, related to the assembly bias phenomenon seen in simulations Hahn et al. (2008) or observational correlations between galaxy properties and their environment Lee and Li (2008).

In a very interesting paper, Matsubara (2008a, b) points out that a perturbative bias model assumed to be local in initial Lagrangian density produces results distinct from the model assumed to be local in final Eulerian density (see also Catelan et al. (2000, 1998)). While Matsubara (2008a) presents this as an advantage of Lagrangian PT, which is supposed to be a more correct way to look at bias, we believe that it is better to say that this represents a deficiency in the development of one or both approaches, not a conceptual problem with either. As a first approximation, it may be more accurate to assume that bias is local in the initial Lagrangian density than the final Eulerian density, but neither assumption can be rigorously justified. Barring the unlikely proof that one approach is fundamentally superior to the other, one criterion for believing future very high precision cosmology measurements should be that Lagrangian and Eulerian PT give equivalent answers in regimes where the calculations converge, once all possible freedom is included in each version of the bias model. We prefer to work with the Eulerian model simply because it is expressed in terms of quantities that are generally more directly observable. This paper will implicitly address the differences between Lagrangian and Eulerian PT raised by Matsubara (2008a, b).

Note that, while we primarily discuss results in terms of the power spectrum, nothing about the perturbative approach intrinsically requires one to go to Fourier space. It is simple to obtain the correlation function by Fourier transforming the power spectrum, but it is also possible to do all of the same calculations, from scratch, in configuration space.
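As a minimal sketch of the Fourier-transform route mentioned above, the following computes the isotropic correlation function $\xi(r) = \frac{1}{2\pi^2}\int dk\,k^2 P(k)\,\sin(kr)/(kr)$ by direct quadrature. The Gaussian power spectrum used here is a toy choice (an assumption made only because its transform is known in closed form), not a cosmological model, and the function name is illustrative.

```python
import numpy as np


def xi_from_pk(r, k, pk):
    """Correlation function from an isotropic power spectrum:
    xi(r) = (1 / 2 pi^2) * integral dk k^2 P(k) sin(kr) / (kr),
    evaluated with the trapezoid rule on the supplied k grid."""
    kr = np.outer(np.atleast_1d(r), k)
    # np.sinc(x) = sin(pi x) / (pi x), so sinc(kr / pi) = sin(kr) / (kr),
    # and it is well defined at kr = 0.
    integrand = k**2 * pk * np.sinc(kr / np.pi)
    dk = np.diff(k)
    integral = np.sum(0.5 * (integrand[:, 1:] + integrand[:, :-1]) * dk, axis=1)
    return integral / (2.0 * np.pi**2)


# Toy spectrum P(k) = exp(-a^2 k^2), whose 3D transform is
# xi(r) = exp(-r^2 / (4 a^2)) / (8 pi^(3/2) a^3).
a = 5.0
k = np.linspace(0.0, 4.0, 20001)
pk = np.exp(-(a * k) ** 2)
r = np.array([0.0, 5.0, 10.0, 20.0])

xi_numeric = xi_from_pk(r, k, pk)
xi_exact = np.exp(-r**2 / (4.0 * a**2)) / (8.0 * np.pi**1.5 * a**3)
print(np.max(np.abs(xi_numeric / xi_exact - 1.0)))
```

For a realistic power spectrum with broad support in $k$, a dedicated oscillatory-integral method would be a better choice than a plain trapezoid rule.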

The plan of the rest of the paper is as follows: In §II we discuss the primary new extensions to the Eulerian bias model that we will work out fully in this paper: including dependence on the local large-scale tidal tensor and velocity divergence and shear. In §III we briefly discuss some further extensions that are implied by the same line of thinking, related to redshift-space distortions, short-range non-locality, and non-Gaussianity of the primordial perturbations, although we will not fully develop them. Finally, in §IV we will give some conclusions and thoughts on directions for future work.

## II A more general Eulerian bias model

In this section we lay out a baseline extension to the model of galaxy bias as dependent on local density only. In §II.1 we discuss the variables we will allow the galaxy density to depend on, and in §II.2 we compute statistics of galaxy clustering using these variables.

### II.1 Independent variables

This subsection seeks to answer the question: In general, in principle, in perturbation theory, what can the galaxy density depend on?

Everything we know about LSS at a given time in standard perturbation theory (PT) is contained in the dynamical variables $\delta(\mathbf{x}) = \rho(\mathbf{x})/\bar{\rho} - 1$, where $\rho(\mathbf{x})$ is the mass density at position $\mathbf{x}$ and $\bar{\rho}$ is the mean mass density, and $\theta = -\left(aH\,d\ln D/d\ln a\right)^{-1}\nabla\cdot\mathbf{v}$, where $\mathbf{v}$ is the peculiar velocity (see Bernardeau et al. (2002) for a review of LSS PT – note that we will make the usual approximation that the Einstein-de Sitter PT results can be used for other models as long as the linear growth factor is replaced by the growth factor in the desired model). Because the velocity field is curl-free, it can be derived from $\theta$, i.e., $\mathbf{v} = -\left(aH\,d\ln D/d\ln a\right)\nabla\nabla^{-2}\theta$ ($\nabla^{-2}$ represents the usual potential integral, or $-k^{-2}$ in Fourier space). To allow for the non-locality (in the density field) introduced by gravity, we will also consider dependence of the galaxy density on the local potential field, $\phi$, which can always be derived from $\delta$ using the Poisson equation. Allowing dependence on $\phi$ and $\mathbf{v}$, in spite of the fact that the system is entirely determined by $\delta$ and $\theta$, can be understood as allowing for history dependence of the number of galaxies in a given patch of space, i.e., these quantities tell us about the path the patch took to get to the density and velocity divergence that it has.

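A homogeneous change in $\phi$ should not be observable, which suggests that the galaxy density should only depend on $\nabla_i\phi$. Furthermore, a homogeneous gravitational force shouldn’t be observable either, suggesting that we should use $\nabla_i\nabla_j\phi$. Therefore we define:

$$
s_{ij}(\mathbf{x}) \equiv \nabla_i\nabla_j\phi(\mathbf{x}) - \frac{1}{3}\delta^K_{ij}\,\delta(\mathbf{x}) = \gamma_{ij}\,\delta(\mathbf{x}), \tag{2}
$$

where we have removed the trace of $\nabla_i\nabla_j\phi$ because it is redundant with $\delta$ (note that we are absorbing all of the spatially constant factors in the Poisson equation into the definition of $\phi$, i.e., $\nabla^2\phi=\delta$ – we will make a similar re-definition of $\mathbf{v}$ to make $\theta=\nabla_i v_i=\delta$ in linear theory). For compactness, we have defined the operator

$$
\gamma_{ij} \equiv \nabla_i\nabla_j\nabla^{-2} - \frac{1}{3}\delta^K_{ij}. \tag{3}
$$

Similarly, a homogeneous velocity field should not be observable, suggesting that galaxy density depends on velocity through $\nabla_i v_j$. Because $\nabla_i v_j=\nabla_i\nabla_j\phi$ at linear order, $\nabla_i v_j$ is redundant with $\nabla_i\nabla_j\phi$ at linear order, so it simplifies things in perturbation theory to use their difference for our independent variables, i.e., to define

$$
t_{ij}(\mathbf{x}) \equiv \nabla_i v_j(\mathbf{x}) - \frac{1}{3}\delta^K_{ij}\,\theta(\mathbf{x}) - s_{ij}(\mathbf{x}) = \gamma_{ij}\left[\theta(\mathbf{x})-\delta(\mathbf{x})\right] \tag{4}
$$

and

$$
\eta(\mathbf{x}) \equiv \theta(\mathbf{x}) - \delta(\mathbf{x}). \tag{5}
$$

The difference variables $t_{ij}$ and $\eta$ are non-zero only at 2nd order.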

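Now, the galaxy density will depend on $\delta$, $s_{ij}$, $t_{ij}$, and $\eta$, but it can’t depend directly on anything but a scalar quantity. This is because, assuming homogeneity and isotropy, we can only have constant, scalar, bias parameters. For example, the general Taylor series for a function $f$ that depends on a small tensor $x_{ij}$ is

$$
f(x_{ij}) = f(0) + \left.\frac{\partial f}{\partial x_{ij}}\right|_{0} x_{ij} + \frac{1}{2}\left.\frac{\partial^2 f}{\partial x_{ij}\,\partial x_{kl}}\right|_{0} x_{ij}x_{kl} + \ldots \tag{6}
$$

In general, each element of the coefficient tensor $\partial f/\partial x_{ij}$ could be independent, but this is inconsistent with isotropy. The only consistent possibility is $\partial f/\partial x_{ij}\propto\delta^K_{ij}$. In this case, only the trace $x_{ii}$ enters the linear term of the Taylor series. Similar arguments apply to higher order terms.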
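The isotropy argument can be illustrated numerically. The sketch below is an illustration under stated assumptions, not part of the original derivation: it averages an arbitrary $3\times3$ coefficient matrix over the 24 proper rotations of the cube, used here as a finite stand-in for the full rotation group (for rank-2 tensors this subgroup is already enough to force the result). The average comes out proportional to the identity, so only the trace of the tensor it multiplies can survive in the linear term. All names are hypothetical.

```python
import itertools

import numpy as np


def cube_rotations():
    """Return the 24 proper rotations of the cube as signed permutation matrices."""
    rotations = []
    for perm in itertools.permutations(range(3)):
        for signs in itertools.product((1.0, -1.0), repeat=3):
            rot = np.zeros((3, 3))
            for row, (col, sign) in enumerate(zip(perm, signs)):
                rot[row, col] = sign
            # Keep only determinant +1 matrices (rotations, not reflections).
            if np.isclose(np.linalg.det(rot), 1.0):
                rotations.append(rot)
    return rotations


def rotation_average(coeff):
    """Average R coeff R^T over the cube rotation group."""
    rots = cube_rotations()
    return sum(rot @ coeff @ rot.T for rot in rots) / len(rots)


rng = np.random.default_rng(0)
coeff = rng.normal(size=(3, 3))          # arbitrary, anisotropic coefficients
coeff_iso = rotation_average(coeff)      # what survives after imposing the symmetry
print(np.round(coeff_iso, 12))
print(np.trace(coeff) / 3.0)
```

The printed matrix is diagonal with all entries equal to one third of the trace of the input, which is the statement that the isotropic part of the coefficient is proportional to $\delta^K_{ij}$.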

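By construction, ${\rm Tr}(s_{ij})=0$ and ${\rm Tr}(t_{ij})=0$. We can construct products, up to 3rd order in the initial perturbations, $s^2\equiv s_{ij}s_{ji}$, $s^3\equiv s_{ij}s_{jk}s_{ki}$, and $st\equiv s_{ij}t_{ji}$ ($t^2$ is 4th order). It turns out that, at 2nd order in PT, $\eta=\frac{2}{7}s^2-\frac{4}{21}\delta^2$. This suggests that, in place of $\eta$, we use a variable constructed to be zero at both 1st and 2nd order in standard PT,

$$
\psi \equiv \eta - \frac{2}{7}s^2 + \frac{4}{21}\delta^2. \tag{7}
$$

This definition makes $\psi$ non-zero only at 3rd order. Note that we can not redefine $t_{ij}$ in terms of $\psi$ because this would require terms like $\gamma_{ij}s^2$. To summarize, our galaxy density will (naively) be a Taylor series involving the following eight quantities:

$$
\delta,\quad \delta^2,\quad \delta^3,\quad s^2,\quad \delta s^2,\quad \psi,\quad st,\quad s^3. \tag{8}
$$

This shows why standard linear theory bias, $\delta_g=b\,\delta$, is sufficient in the truly linear regime: all other independent scalar quantities we can form are higher order.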
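The 2nd-order relation between $\theta-\delta$, $s_{ij}s_{ji}$, and $\delta^2$ can be checked against the standard Einstein-de Sitter PT kernels. The sketch below assumes the usual symmetrized kernels $F_2$ and $G_2$ from the PT literature (e.g., the Bernardeau et al. (2002) review); these are not written out in the text above and are brought in here as an assumption. It verifies that $G_2-F_2=\frac{2}{7}\left(\mu^2-\frac{1}{3}\right)-\frac{4}{21}$, where $\mu$ is the cosine of the angle between the two wavevectors and $\mu^2-\frac{1}{3}$ is the kernel of $s_{ij}s_{ji}$ for two linear modes. Function names are illustrative.

```python
import numpy as np


def f2_kernel(mu, ratio):
    """Symmetrized 2nd-order density kernel; ratio = k1 / k2."""
    return 5.0 / 7.0 + 0.5 * mu * (ratio + 1.0 / ratio) + (2.0 / 7.0) * mu**2


def g2_kernel(mu, ratio):
    """Symmetrized 2nd-order velocity-divergence kernel."""
    return 3.0 / 7.0 + 0.5 * mu * (ratio + 1.0 / ratio) + (4.0 / 7.0) * mu**2


def s2_kernel(mu):
    """Kernel of s_ij s_ji for two linear modes: mu^2 - 1/3."""
    return mu**2 - 1.0 / 3.0


def s2_kernel_from_vectors(k1, k2):
    """Same kernel built explicitly from the traceless tensors k_i k_j / k^2 - delta_ij / 3."""
    def traceless(kvec):
        return np.outer(kvec, kvec) / np.dot(kvec, kvec) - np.eye(3) / 3.0
    return np.trace(traceless(k1) @ traceless(k2))


def psi2_kernel(mu, ratio):
    """2nd-order kernel of (theta - delta) - (2/7) s^2 + (4/21) delta^2; the delta^2 kernel is 1."""
    eta2 = g2_kernel(mu, ratio) - f2_kernel(mu, ratio)
    return eta2 - (2.0 / 7.0) * s2_kernel(mu) + 4.0 / 21.0


mu_grid = np.linspace(-1.0, 1.0, 41)[:, None]
ratio_grid = np.logspace(-2.0, 2.0, 25)[None, :]
residual = np.max(np.abs(psi2_kernel(mu_grid, ratio_grid)))
print(residual)  # zero up to floating-point round-off
```

The residual vanishes for every angle and wavenumber ratio, which is why the combination is non-zero only from 3rd order onward.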

Finally, our model, which now starts with , will be extended to include general dependence on a mean-zero Gaussian white noise variable , i.e., , to allow for stochasticity and shot-noise in the galaxy density-mass density relation. This approach is new relative to past work where a noise variable was simply tacked onto the end of the Taylor series. We will Taylor expand around , just like the other variables, treating epsilon as similar in size to , and including all higher order terms. This may appear strange, and actually will not affect power spectrum calculations at all, but we will see when we compute the bispectrum that this is a compact way to include the fact that Poisson sampling of the density field actually affects the bispectrum, in contrast to Gaussian noise Peebles (1980); Smith et al. (2008).

A Taylor series in these quantities, up to 3rd order in the initial perturbations, is

(note that the factors of and serve no real purpose, because the ’s are essentially arbitrary and could be redefined to include these factors).

One might ask at this point: Why not add more derivatives, e.g., terms like or products of ? Also, why not make the dependence non-local, i.e.,

(10) |

where can be any position, not just the position where we are measuring the density. It turns out that these things are related, as we will discuss further in §III.1. As long as the non-locality is short range, it can be easily represented by a controlled series of higher derivative terms like . Terms like , which we will not consider, introduce new long-range operators, beyond the one already present in the construction of the gravitational potential.
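To see how short-range non-locality turns into a series of higher derivative terms, consider a toy case in which the galaxy density responds to the mass density smoothed with a Gaussian kernel of scale $R$. The Gaussian form is an assumption made purely for illustration. In Fourier space the kernel is $\exp(-k^2R^2/2) = 1 - k^2R^2/2 + \ldots$, and since $\nabla^2\rightarrow -k^2$, the first correction corresponds to a term proportional to $R^2\nabla^2\delta$. The sketch compares the exact kernel with the truncated series; names are illustrative.

```python
import numpy as np


def gaussian_kernel(k, scale):
    """Fourier-space Gaussian smoothing kernel of scale R (assumed toy non-locality)."""
    return np.exp(-0.5 * (k * scale) ** 2)


def derivative_series(k, scale, n_terms):
    """Truncated expansion of the Gaussian kernel in powers of k^2 R^2.
    Term n is (-(k R)^2 / 2)^n / n!, i.e., a (R^2 nabla^2)^n operator in real space."""
    x = -0.5 * (k * scale) ** 2
    total = np.zeros_like(x)
    term = np.ones_like(x)
    for n in range(n_terms):
        total = total + term
        term = term * x / (n + 1)
    return total


scale = 1.0
k = np.array([0.05, 0.1, 0.2, 0.5])
exact = gaussian_kernel(k, scale)
local_only = derivative_series(k, scale, 1)      # purely local response
with_laplacian = derivative_series(k, scale, 2)  # adds the R^2 nabla^2 term

print(np.abs(exact - local_only))
print(np.abs(exact - with_laplacian))
```

The error of the local approximation scales as $k^2R^2$, and adding the first derivative term reduces it to order $k^4R^4$, which is the sense in which the series is controlled when $kR$ is small.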

One might also wonder about the eigenvalues of $s_{ij}$, Desjacques and Smith (2008): Are they not additional scalar quantities that are linear in the perturbation amplitude, and thus loopholes in the argument that linear order bias can only depend on $\delta$? In three dimensions, they are hard to write down explicitly, but the two dimensional version is informative: $\lambda_\pm=\pm\sqrt{s_{11}^2+s_{12}^2}$. We see that these quantities are in some sense the same order as $s_{ij}$, but they are not well behaved analytic functions of $s_{ij}$. This is illustrated by considering a similar, but simpler to understand, possible term, $|\delta|$. At $\delta=0$, $|\delta|$ is not differentiable, and it becomes especially obvious how unphysical this must be when we observe that local physics has no particular reason to see the mean density of the Universe as a special value. Similarly, it seems unlikely that it is physically correct for the dependence of galaxy density on $s_{11}$ (for $s_{12}=0$) to make a sharp change of direction at $s_{11}=0$ (which is just the transition from a tensor extended in the 1 direction to the 2 direction), as it would if we included terms linear in the eigenvalues. It is undoubtedly possible for the galaxy density to depend on these eigenvalues – the argument here is simply that this dependence should be higher than linear order. Our parameterization actually already includes this dependence very directly: $s^2=\sum_i\lambda_i^2$, i.e., $s^2$ is the sum of squares of the eigenvalues.
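Both statements about the eigenvalues are easy to check numerically. The sketch below, with illustrative helper names, confirms that a traceless symmetric $2\times2$ tensor has eigenvalues $\pm\sqrt{s_{11}^2+s_{12}^2}$ (so that, for $s_{12}=0$, they reduce to $\pm|s_{11}|$, with a kink at $s_{11}=0$), and that for a traceless symmetric $3\times3$ tensor the contraction $s_{ij}s_{ji}$ equals the sum of squared eigenvalues.

```python
import numpy as np


def traceless_2d(s11, s12):
    """Traceless symmetric 2x2 tensor with independent components s11 and s12."""
    return np.array([[s11, s12], [s12, -s11]])


def eigenvalues_2d(s11, s12):
    """Closed-form eigenvalues of the traceless symmetric 2x2 tensor."""
    lam = np.hypot(s11, s12)
    return -lam, lam


def random_traceless_symmetric(rng, dim=3):
    """Random symmetric matrix with its trace removed."""
    mat = rng.normal(size=(dim, dim))
    sym = 0.5 * (mat + mat.T)
    return sym - np.trace(sym) / dim * np.eye(dim)


# Two-dimensional case: numerical eigenvalues match the closed form.
print(np.linalg.eigvalsh(traceless_2d(0.3, -0.4)), eigenvalues_2d(0.3, -0.4))

# For s12 = 0 the largest eigenvalue is |s11|: the same for s11 = +0.2 and -0.2.
print(eigenvalues_2d(0.2, 0.0)[1], eigenvalues_2d(-0.2, 0.0)[1])

# Three-dimensional case: s_ij s_ji is the sum of squared eigenvalues.
rng = np.random.default_rng(1)
s_tensor = random_traceless_symmetric(rng)
lam3 = np.linalg.eigvalsh(s_tensor)
s_squared = np.trace(s_tensor @ s_tensor)
print(s_squared, np.sum(lam3**2))
```

The contraction is a smooth (quadratic) function of the tensor components, whereas the individual eigenvalues are not, which is the distinction the argument above relies on.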

The bottom line is: We stick to the terms that are obtained in a Taylor series in , , and , with only short range (relative to the scale of observations) non-locality in the dependence of galaxy density on these quantities. We leave for the future the question of how completely general this approach is.

### II.2 Statistics

The mean galaxy density is, to 3rd order in the initial perturbations,

(11) |

where , , and . Redefining all the coefficients after division by gives

#### II.2.1 Galaxy-mass cross-spectrum

For simplicity, we start by calculating the mass density-galaxy density cross-spectrum, i.e., , which is

See the Appendix for definitions of , , and . is the non-linear mass power spectrum. with no subscript always refers to the linear theory mass power. Note that the term works out to exactly zero, so the parameter has been rendered irrelevant.

As we found in McDonald (2006), some terms like appear which are best treated as renormalizations of the linear theory bias, i.e., by a redefinition like . As discussed in McDonald (2006), the un-smoothed density variance may not be literally infinite, depending on the power spectrum, but it will be large, and sensitive to the deeply non-linear regime where all of our calculations are meaningless. It is best to think of the original as an un-observable “bare” parameter, with the observable linear bias factor being largely un-related to it as the sum of many higher order terms which are generally much larger. This idea that the values of the parameters of large-scale galaxy clustering are generated by small-scale, higher order effects is physically reasonable, or even expected — after all, if there were truly only small, linearizable, perturbations in the Universe, there would be no galaxies.

The term associated with has an interesting new feature. In the limit, we find

(14) |

Like the term, for example, this looks like a renormalization of the linear bias; however, unlike the term, here there is non-trivial dependence as one goes to non-zero . This case provides an opportunity to demonstrate how the renormalization works more clearly. Defining , , and

(15) |

we have

(16) |

where . gives the weight function over which one must integrate to obtain the bias term. Figure 2 shows a plot of .

We see that is constant as . This leads to the constant result as , and is clearly undesirable as it represents sensitivity to arbitrarily small, highly non-linear scales. The solution is to subtract the result, i.e.,

(17) |

where

(18) |

now looks like a smoothing kernel, with no sensitivity to power for , i.e., (the factor was chosen to make , i.e., to look like the Fourier transform of a mass conserving smoothing kernel). The change in bias due to this term at observed scale is quite simply proportional to the variance on scale , as defined by the weighting function .

A similar procedure must be followed with the second term, i.e.,

(19) |

All of the other terms go to zero for small . As in McDonald (2006), we now define the observable, renormalized, linear bias as the sum of bias-like terms

(20) |

Note that this is the only appearance of the parameters , , and , so they are no longer needed. In fact, the random noise variable has completely disappeared, just as it would have if it had been included only as a single term at the end of the Taylor series.

The result simplifies even more when we find, somewhat surprisingly, that the three terms proportional to in Eq. (13) are exactly proportional to each other, after renormalization and angle-integration. This means that we can define one merged term that accounts for all of them, i.e.,

where

(22) |

Note that the inclusion of the term in this redefinition is convenient but not at all necessary, because it is perfectly well-behaved, and the redefinition does not remove all appearances of the parameter . The reason to include this term in the redefinition is that, presumably, a fit to data using and will show less degeneracy between the two parameters if the functions they multiply do not have substantial components which have identical form.

Finally, we define normalized parameters , , and to produce the power spectrum

(23) |

The final expression has two new terms relative to the version from the -only Taylor series in McDonald (2006). The term associated with is more like a true -dependent bias, in the sense that the power at a given is still proportional to the matter power spectrum at that , just multiplied by a -dependent factor; while the other term, associated with , mixes power from a range of scales. These terms come from the correlation of the linear and second order parts, respectively, of the mass density field with the galaxy field. Figure 3 shows the effect of all the terms, for a typical CDM model, at .

We see that the term is actually quite small relative to the others, for similar values of the bias parameters. In this paper the parameter values are completely arbitrary, simply chosen to make the different effects comparable in size in the more easily observable galaxy-galaxy power spectrum, , where the effect of the term is substantially larger (Fig. 4). The term, on the other hand, can have a larger effect on , relative to its effect on .

Note that we could have, completely equivalently, left as our independent variable while redefining to make it non-zero only at 3rd order. All differences in the resulting equations would be numerical factors which can be removed by redefining the parameters. The least trivial looking of these changes would be changing the term in Eq. (23) to ; however, the simple relation between and (Eq. 44) means that this change is equivalent to redefining and .

#### II.2.2 Galaxy-Galaxy power spectrum

We now compute the cross-power spectrum between two types of galaxies, each with a set of bias parameters represented by the letters and . The power spectrum of a single type of galaxy is of course obtained by taking equal bias parameters for each type.

The first two lines in Eq. (24) are the terms proportional to the linear bias factor of one type of galaxy or the other, and are thus essentially just the result re-written (including already all of the same renormalizations). The third line contains the new terms due to cross-products of the 2nd order bias factors. The last line contains cross-terms involving the random variables and , which we have taken to be possibly locally correlated with cross-power spectrum , and cross-variance .

In the limit the new terms in the third line of Eq. (24) are not zero, but are -independent, i.e., they look like locally correlated white noise:

It is interesting to note that these shot-noise-like terms in the power spectrum come from the same terms in the original galaxy density Taylor series which produced a non-zero contribution to the mean density. This is consistent with our expectation that white noise must be associated with non-conservation of the field. As in McDonald (2006), we can absorb these constant terms into the observable noise matrix, but first we need to discuss the -related terms.

We define the lowest order -related term in the last line of Eq. (24) to be . If we were only calculating to lowest order, this would be the usual galaxy shot-noise. The rest of the terms are also constants (-independent), so they can be simply interpreted as renormalizing this noise matrix, i.e., in spite of the apparent large number of new terms, there is actually nothing new here at all. After renormalization, the result is a completely general effective noise matrix for the galaxies, i.e., some choice of the bias parameters can produce any mathematically legitimate matrix. Altogether, the formal redefinition is:

The result that we should have a general free noise matrix is insensitive to assumptions about the form of the matrix – we could start by assuming that and are perfectly correlated (i.e., there is really only one random variable), or perfectly independent, and in either case the renormalizations would generate the extra freedom. We do require some intrinsic randomness, i.e., we cannot start with and rely entirely on the noise matrix generated by the density fluctuations (if we want to allow for different types of galaxies to be uncorrelated, or correlated in a way different from that given by the right-hand side of Eq. II.2.2). This is somewhat unsatisfactory as the randomness in the initial density field must ultimately be the source of randomness in the outcome – we speculate that higher order density field terms will produce a general noise matrix, so that eventually there will be no need to give a seed variance. Note that one should not think too hard about where a noise matrix that is nearly diagonal with elements equal to the inverse mean number density of galaxies () comes from in this picture (aside from observing that it is possible). The terms that appear on the right hand side of Eq. (II.2.2) do not need to add up to the observable noise in any literal sense, because the observable noise will contain other, possibly even larger, terms at higher order. Eq. (II.2.2) just shows why it is legitimate to drop the undesirable terms (that are non-zero as , including all of the -related terms) in the PT calculation, i.e., because they are redundant with a free noise matrix. One should remember that the idea of Poisson sampling, i.e., the model for noise power, was never more than an apparently quite accurate guess – Smith et al. (2007), for example, found deviations for dark matter halos.
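As a minimal sketch of what a free noise matrix means in practice, the code below builds the Poisson-sampling guess (diagonal, with elements equal to the inverse mean number density) and tests whether a candidate matrix is admissible. Two things here are assumptions of this illustration rather than statements from the text: "mathematically legitimate" is read as symmetric with no negative eigenvalues, and the function names and numerical values are hypothetical.

```python
import numpy as np

def poisson_noise_matrix(nbar):
    """Diagonal noise matrix with elements 1 / nbar_i (Poisson-sampling guess)."""
    nbar = np.asarray(nbar, dtype=float)
    return np.diag(1.0 / nbar)

def is_legitimate_noise_matrix(n, tol=1e-12):
    """True if n is symmetric with no negative eigenvalues (assumed criterion)."""
    n = np.asarray(n, dtype=float)
    if not np.allclose(n, n.T):
        return False
    return bool(np.linalg.eigvalsh(n).min() >= -tol)

# Two galaxy types with illustrative number densities.
n_poisson = poisson_noise_matrix([1e-4, 2e-4])

# A more general matrix with correlated noise between the two types.
n_general = np.array([[1.0e4, 2.0e3],
                      [2.0e3, 5.0e3]])

print(is_legitimate_noise_matrix(n_poisson), is_legitimate_noise_matrix(n_general))
```

In a fit, the elements of such a matrix would be treated as free parameters subject only to this admissibility condition, with the Poisson form as one special case.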

We are left with the final power spectrum:

This equation is not as complicated as it may look, including only a few simple building blocks: , , , , and (in ).

Figure 4 shows examples of the auto-power spectrum for a single type of galaxy.

We see that the effects of each term are somewhat different. The term has a greater influence at larger relative to smaller scales than the term. Those two terms can have either sign, but the term is essentially always negative. Note that the power spectrum is not linear in the bias parameters, so the outcome when all of the parameters are varied is more complex than a simple sum of the examples we show. The increase due to the term actually reaches a maximum (for ) at , before declining again as the negative quadratic part comes to dominate (this transition is apparent as the flattening at the high end in the figure).

#### II.2.3 Bispectrum

The bispectrum is the three-point correlation function Kulkarni et al. (2007); Gaztañaga et al. (2005); Pan and Szapudi (2005); Gaztañaga and Scoccimarro (2005); Frieman and Gaztanaga (1994) in Fourier space. It vanishes if the density fluctuations are Gaussian. The bispectrum can be used to measure non-Gaussianity in the primordial density distribution, if any, and non-Gaussianity induced by non-linear gravitational evolution and bias Smith et al. (2008); Nishimichi et al. (2007); Scoccimarro (2000); Verde et al. (1998); Matarrese et al. (1997). Sefusatti et al. (2006) and Sefusatti and Scoccimarro (2005) show that the bispectrum is a very powerful addition to the power spectrum for general cosmological parameter constraints, especially on the primordial power spectrum amplitude and slope. In this section we show the form of the galaxy bispectrum in our generalized bias model. Only 2nd order terms in the density perturbations are needed to construct the bispectrum to 4th order. By definition, the bispectrum takes the following form

(28) |

where means that only closed triangular configurations are non-zero. In our calculations, we assume that the primordial density fluctuations did not have any signature of non-Gaussianity. The galaxy bispectrum is then

(29) | |||||

where we note that the angle between any two of the vectors is determined by the length of the third. We have defined and , and is the noise power. Here we see directly the convergence between Eulerian and Lagrangian bias that we were hoping for – the new term introduces the extra configuration dependence in the bispectrum found for Lagrangian bias by Catelan et al. (2000); Bernardeau et al. (2002); Catelan et al. (1998). Note that Feldman et al. (2001) actually compared Lagrangian vs. traditional (density-only) Eulerian bias in fits to the PSCz bispectrum, but did not have enough statistical power to distinguish them (Eulerian bias was slightly preferred).
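The statement that the angle between any two wavevectors is fixed by the length of the third follows from the closed-triangle condition: if the three vectors sum to zero, squaring the third gives the law of cosines. A short sketch, with a hypothetical function name and argument labels chosen for illustration:

```python
def cos_angle(k1, k2, k3):
    """Cosine of the angle between vectors k1 and k2, given k1 + k2 + k3 = 0.

    Arguments are the three side lengths. From |k3|^2 = |k1 + k2|^2:
    k3^2 = k1^2 + k2^2 + 2 k1 k2 cos(theta_12).
    """
    if not (abs(k1 - k2) <= k3 <= k1 + k2):
        raise ValueError("side lengths do not form a closed triangle")
    return (k3**2 - k1**2 - k2**2) / (2.0 * k1 * k2)

# Equilateral configuration: the wavevectors are 120 degrees apart.
mu_equilateral = cos_angle(1.0, 1.0, 1.0)

# Side lengths 3, 4, 5: the first two wavevectors are perpendicular.
mu_right = cos_angle(3.0, 4.0, 5.0)

print(mu_equilateral, mu_right)
```

A bispectrum can therefore be tabulated as a function of three side lengths alone, with all angular factors computed from them.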

We see now the purpose in the introduction of the full structure of -related terms. These terms have produced exactly the structure needed to correctly represent Poisson noise in the bispectrum. If the galaxies were a Poisson sampling of the underlying biased density field, we would have and Peebles (1980); Smith et al. (2008). Even the appearance of the extra new free parameters, and , is necessary, as Smith et al. (2008) showed that galaxies in simple halo models do not obey Poisson sampling exactly, but instead follow the more general form we find here, with the values of and depending on the details of the model (in fact, our introduction of this treatment of noise was entirely motivated by Smith et al. (2008)).
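For reference, the pure Poisson-sampling limit can be sketched as follows. The expressions used are the standard Poisson shot-noise results (noise power equal to the inverse number density, and a bispectrum noise term built from the galaxy power at the three wavenumbers plus a constant). They are assumed here to be the limit the text refers to, and the function names and input values are hypothetical.

```python
import math

def poisson_power_noise(nbar):
    """Shot-noise power for Poisson sampling: 1 / nbar."""
    return 1.0 / nbar

def poisson_bispectrum_noise(p1, p2, p3, nbar):
    """Standard Poisson shot-noise contribution to the measured bispectrum.

    p1, p2, p3 are the noise-free galaxy power spectra at the three
    wavenumbers of the triangle; nbar is the mean number density.
    """
    return (p1 + p2 + p3) / nbar + 1.0 / nbar**2

# Illustrative numbers only.
nbar = 1.0e-4
noise_p = poisson_power_noise(nbar)
noise_b = poisson_bispectrum_noise(1.0e4, 1.0e4, 1.0e4, nbar)

print(noise_p, noise_b)
```

In the more general treatment described above, the coefficients that Poisson sampling fixes in terms of the number density are instead left free.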

A reduced bispectrum, which does not depend on the mass power spectrum amplitude, is often written as

(30) |
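The amplitude independence of the reduced bispectrum can be checked at tree level for the mass field. This sketch assumes the conventional definition (the bispectrum divided by the sum of the three pairwise products of power spectra) and the standard second-order perturbation theory kernel in its Einstein-de Sitter form. Neither is reproduced in this section, so both are assumptions, as is the toy power-law spectrum.

```python
import math

def cos_angle(ka, kb, kc):
    """Cosine of the angle between ka and kb for a closed triangle."""
    return (kc**2 - ka**2 - kb**2) / (2.0 * ka * kb)

def f2(ka, kb, mu):
    """Standard symmetrized second-order kernel; mu is the cosine of the angle."""
    return 5.0 / 7.0 + 0.5 * mu * (ka / kb + kb / ka) + (2.0 / 7.0) * mu**2

def reduced_bispectrum(k1, k2, k3, power):
    """Tree-level mass bispectrum divided by the cyclic sum of power products."""
    p1, p2, p3 = power(k1), power(k2), power(k3)
    b = 2.0 * (f2(k1, k2, cos_angle(k1, k2, k3)) * p1 * p2
               + f2(k2, k3, cos_angle(k2, k3, k1)) * p2 * p3
               + f2(k3, k1, cos_angle(k3, k1, k2)) * p3 * p1)
    return b / (p1 * p2 + p2 * p3 + p3 * p1)

def power_law(amplitude, slope=-1.5):
    """Toy power-law spectrum (hypothetical, for illustration only)."""
    return lambda k: amplitude * k**slope

q_equilateral = reduced_bispectrum(0.1, 0.1, 0.1, power_law(1.0))
q_low = reduced_bispectrum(0.05, 0.08, 0.1, power_law(1.0))
q_high = reduced_bispectrum(0.05, 0.08, 0.1, power_law(10.0))

print(q_equilateral, q_low, q_high)
```

Rescaling the power spectrum amplitude leaves the result unchanged, and for equilateral triangles the tree-level value is 4/7 whatever the spectrum.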

The reduced galaxy bispectrum, to leading order, is then,

(31) |

where is the reduced bispectrum of the mass density perturbations. The noise terms, which we have dropped from this presentation of , undermine the elegance of using . We suspect that it will be more straightforward to interpret noisy observations using a simultaneous fit to and , rather than going through .

Figure 5 shows some examples of the reduced bispectrum and bias terms.

has been discussed as a means to measure